oleavr/_.md Secret

## _.md

      
I strongly believe that the concepts of a language runtime and "dynamic introspection" should be fundamentally decoupled.

That we agree on, and is precisely how I designed Frida.

The architecture of Frida, and I'm going to be extremely blunt here (...snip...) is a disorganized mess that mixes all of these layers together in ways that I find completely unacceptable.

Frida has a modular, highly decoupled architecture, and you can use it à la carte. You can grab frida-gum, the instrumentation core, and use it from C. This gives you access to function hooking, introspection of loaded libraries, their exports, mapped memory ranges, etc. You can also use it through one of its two language bindings: from C++ via Gum++, or from JavaScript via GumJS.
Now, for a lot of applications you need a way to inject your own code into another process that you want to introspect or instrument. This is where frida-core provides an injector per OS, which is not in any way coupled to one specific payload. (As a side note, on Mac and iOS frida-gum provides an out-of-process dynamic linker that frida-core's injector uses to map your .dylib into sandboxed processes.) These injectors are not currently exposed through the public API, but the plan is to expose them; there just hasn't been much demand for it.
So, recognizing that most applications actually just want an easy way to use Gum's APIs from the inside of another process, we do provide a high-level API where you can simply say "run this piece of JavaScript with full access to Gum's APIs from the inside of that other process, and let me optionally exchange JSON messages with it". This simply composes the previously mentioned components to take care of the nitty-gritty for you: it packages GumJS into a shared library, frida-agent, injects it using the platform-specific injector, and does the necessary RPC for you to instantiate scripts and exchange messages. This high-level API is also exposed through multiple language bindings, like Python, Node.js, Swift, .NET, and Qt/Qml.
So, going back to the standalone injection, a lot of applications will also need a bi-directional communications channel. Once they've implemented that, they'll run into problems like this. This is where frida-core's internal pipe library gives you a portable transport that uses a low-level API on each platform, e.g. mach ports on Mac and iOS, a named pipe on Windows, etc. This is a lot of complexity to burden every single application with, and Cycript is just one of many that will have to solve this problem.

(as you keep making tons of annoying and arbitrary claims about my work while simultaneously never giving me credit for things I pioneer, so I don't really feel like you deserve kid gloves)

My blog post opened by saying Cycript is awesome and that you created it. As far as Frida goes, it was created before I had ever heard of your project, so the way I see it no credit is due there.

If you want to provide Frida's function hooking to Cycript, write a language binding for it... that's all Substrate is to Cycript: a module you can import.

This was among the options I considered, and it is really straightforward to do, but I took a step back and saw a DSL wanting to come out, I saw a type system that didn't have to be written in C++ and tied to JavaScriptCore, and applications beyond an interactive console (which is awesome in itself, but it doesn't have to be coupled).
The thinking behind GumJS is to provide a lean and mean runtime that gives you some essential APIs, and then make it easy for people to compile their own scripts by composing them from Frida-specific modules developed by the community, e.g. for tracing APIs, interacting with UIKit, grabbing screenshots, etc., while also giving you access to thousands of generic modules from npm. We do currently have two Frida-specific modules built in, specifically Objective-C and Java, but the plan is to move these out to their own modules in npm.

If you want to use Frida's process injection to support Cycript injecting into other processes on Windows or on Linux, you should note that cycript literally just runs, as an external process, cynject, which is a tool from Substrate which provides only process injection. As far as I know Frida doesn't have anything similar, but you should.

As mentioned earlier, this is already there and the plan is to expose it, but there hasn't been much demand for it. Last time I had to do this it was in order to get Cycript loaded into a sandboxed process, so I quickly cobbled this together in a matter of minutes.

As it stands, Cycript is actually already extremely portable: it requires readline, JavaScriptCore, and libffi. It doesn't have any concept of assembly outside of libffi. It has no machine-specific concepts embedded into it. Its implementation of Objective-C bindings works on GNU Objective-C as well as Apple's runtime. Its implementation of Java bindings works almost entirely at the JNI level and works with both Google's runtimes for Android as well as Oracle's official runtimes.

For sure, that is a lot of impressive work; I am merely suggesting how to make it even more portable. E.g. it uses a fair amount of GNU C extensions and POSIX APIs that don't exist on Windows, implements its own transport, needs to deal with the architecture-dependent quirks of `objc_msgSend` (stret, fpret), etc.

Given this context--that Frida's architecture seems almost hopelessly coupled

Except it isn't, as I debunked earlier.

--I want to examine your claims of a performance improvement by doing this: you seriously are linking to a comparison of one underutilized feature of Substrate vs. Frida. Putting aside for a second that I'd be surprised if most features of Substrate aren't actually faster than Frida, Cycript's language bindings are almost certainly faster than Frida's as Frida seems to be implementing its FFI layer in JavaScript :/.

This is incorrect. Mjølner is simply implementing a Cycript-compatible type system on top of the bare-metal libffi API provided by GumJS. This just boils down to expressing `Memory.writeU32(dimensions.add(8), 640)` as `dimensions->width = 640` (i.e. `dimensions.$cyi.width = 640` after compilation).
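
To make the mapping from field syntax to raw memory writes concrete, here is a minimal sketch in plain JavaScript. It is a toy model and not Mjølner's or GumJS's actual code: a `DataView` stands in for process memory, `defineStruct` and `bindStruct` are hypothetical helpers, and the `x`/`y`/`width`/`height` layout is an assumption chosen only so that `width` lands at offset 8 as in the example above.

```javascript
// Toy model only: a DataView stands in for process memory. FIELD_TYPES,
// defineStruct and bindStruct are hypothetical names, not Mjølner or GumJS APIs.
const FIELD_TYPES = {
  u16: {
    size: 2,
    read: (view, offset) => view.getUint16(offset, true),
    write: (view, offset, value) => view.setUint16(offset, value, true),
  },
  u32: {
    size: 4,
    read: (view, offset) => view.getUint32(offset, true),
    write: (view, offset, value) => view.setUint32(offset, value, true),
  },
};

// Compute each field's byte offset from a list of [name, type] pairs.
// Fields are packed back to back; a real ABI would also need alignment padding.
function defineStruct(fields) {
  const layout = new Map();
  let offset = 0;
  for (const [name, typeName] of fields) {
    const type = FIELD_TYPES[typeName];
    if (type === undefined) {
      throw new TypeError(`unknown field type: ${typeName}`);
    }
    layout.set(name, { offset, type });
    offset += type.size;
  }
  return { layout, size: offset };
}

// Wrap a region of "memory" so that property access on the returned object
// turns into a typed read or write at base + field offset.
function bindStruct(struct, view, base = 0) {
  return new Proxy({}, {
    get(_target, name) {
      const field = struct.layout.get(name);
      if (field === undefined) return undefined;
      return field.type.read(view, base + field.offset);
    },
    set(_target, name, value) {
      const field = struct.layout.get(name);
      if (field === undefined) {
        throw new TypeError(`no such field: ${String(name)}`);
      }
      field.type.write(view, base + field.offset, value);
      return true;
    },
  });
}

// Assumed layout, picked so that `width` sits at byte offset 8.
const Dimensions = defineStruct([
  ['x', 'u32'],
  ['y', 'u32'],
  ['width', 'u32'],
  ['height', 'u32'],
]);

const memory = new DataView(new ArrayBuffer(Dimensions.size));
const dimensions = bindStruct(Dimensions, memory);

// Field syntax; equivalent to memory.setUint32(8, 640, true).
dimensions.width = 640;

console.log(memory.getUint32(8, true)); // prints 640
```

The point of the sketch is that the type system here is only bookkeeping of offsets and sizes. The actual memory access is still a single primitive write, so the layer itself can live in JavaScript without the FFI being implemented there.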

I am going to repeat: Cycript is a programming environment.

That's what it is, but couldn't it be more if you decoupled the pieces?

It is in many ways quite comparable to Python, and as such it is the implementation of a language called "Cycript" (where the hell did you come up with "cylang"?!?).

I was trying to disambiguate the language from the interactive console. Just as Python's REPL does not have to be part of its runtime.

Of course, a more obvious comparison is to node.js, but it is a weird mix of syntax designed to let you slide between semantics of various other programming languages. It integrates these syntax features to provide seamless and fluent bindings to Objective-C and Java.

I understand your point, and what you have built is awesome, but does that mean it should not evolve to the next level of awesomeness? :-) It's fine that we disagree on what precisely that is, though.

That's all Cycript is, and I've been careful to remove, not add, dependencies or concepts of "dynamic introspection" from Cycript. Cycript is extremely portable, and is only going to get more portable over time. As an example of this, right now Cycript has a -p argument which internally resolves a process name into a process identifier. That functionality should not be in cycript: instead, that functionality should be in cynject, which is part of Substrate. I actually have had that on my todo list the past two weeks.

That's great, but things are still highly coupled. OTOH, tight integration and highly decoupled architectures both have their pros and cons. Tighter integration obviously gives you more vertical control.

Moving that from Cycript to Substrate means that more complex ways of resolving processes, such as Frida's device target, would also be supported in a very natural manner... of course, assuming Frida had a stand-alone injection mechanism Cycript could run instead of cynject, which it doesn't seem to have. I would be more than happy to provide a way to specify the name of the code injection tool to use is (which would be really clean now that Cycript no longer relies on socket backchannels).

As mentioned earlier this is trivially exposable and I plan to do that, but there just hasn't been much demand for it, as requesters typically end up realizing that Frida already has a higher-level solution to their problem: "Oh, those hooks don't have to be written in C, JavaScript with a feedback loop of milliseconds makes me so much more productive, and no need for my own platform-specific communication channels, etc."

I have friends who are on the ECMAScript standardization committee, and I care a lot about implementation details in the JavaScript runtime. The use cases and vision I have for Cycript are things which may eventually require modifications to the runtime to get the kind of performance I want binding across different VMs. Future versions of Cycript might not generate vanilla JavaScript.

That's fine, there's definitely more than one way to skin this cat.

I'm not going to merge a massive dependency on Frida, when Frida is literally just v8/duktape and doesn't have any benefits.

Agree to disagree on this one, as already explained. And if footprint is a concern it is trivial to build GumJS without V8, which is what we currently do for embedded targets. It is also trivial to build frida-core without the local backend, if you only care about remote iOS devices for example.

The reverse, however, just doesn't seem to be true: I don't understand why you have mingled all these parts together, and I don't understand why you are keen to keep them together.

They are not mingled together, as explained earlier.

If you provided tiny tools instead of a massive wad of stuff, Frida would probably get more use in the field.

Again, this is just not true. And it is being heavily used in the field. People are building tools on top of it, e.g., in no particular order:
- https://github.com/dpnishant/appmon
- https://github.com/mwrlabs/needle
- https://github.com/antojoseph/diff-gui
- https://github.com/AndroidSecurityTools/lobotomy
- https://immunityproducts.blogspot.no/2015_09_01_archive.html
- https://github.com/Nightbringer21/fridump
- https://github.com/OALabs/frida-extract
- https://github.com/nowsecure/r2frida
- etc.

There are also companies building products on top of it. This is precisely the kind of thing Frida was designed for – a platform for cross-platform dynamic instrumentation. Use it through its low-level building blocks or a higher-level API.

It also would let you use Cycript as is and yet still have all of your more-cross-platform function hooking and more-cross-platform code injection functionality. That architecture is just so beautiful and clean :(.

This is already possible, and I would encourage you to do it, but this fork is about exploring an even deeper integration where I tried to reimagine Cycript by recognizing that there is a lot of overlap. Both approaches have their pros and cons, as I mentioned regarding vertical control.


Ability to attach to sandboxed apps on Mac, without touching /usr or modifying the system in any way;
Other than turning off SEP, I don't think Cycript still requires this as of a few weeks ago? The same changes I made for iOS 9.3 to cynject I believe also bypass all of these weird problems on Mac as well (and at 360|iDev I was working with people who had Cycript in a random folder in their home directory and we were able to inject into sandboxed apps, but maybe I wasn't testing an app with a sufficiently strong sandbox).


Actually s/sandboxed apps/sandboxed processes/ to be more precise, e.g. a daemon without any filesystem access.


Instead of crashing the process if you make a mistake and access a bad pointer, you will get a JavaScript exception;
It is not clear to me this is actually a good thing, though I sometimes consider it; this would be a trivial thing to add to Cycript.


It's trivial, in theory, but this is where you quickly end up with OS and arch-specific quirks. Already a solved problem in GumJS.


Frida's function hooking is able to hook many functions not supported by Cydia Substrate.
This literally has nothing to do with Cycript.


It is however what comes bundled with Cycript, and I have yet to speak to a Cycript user who did not do their function hooking through MS.

If nothing else, I'm going to strongly ask that you rename your project from "cycript" to "frida-cycript" or something like that, in the same way that Microsoft calls their fork of nodejs nodejs-chakracore. Someone might start using your fork, and then start talking about what it can and can't do, or provide code examples using it, and you are going to undermine people being able to talk about Cycript. People who find your fork should know that it isn't actually Cycript: it is extremely unrelated, really.

We wholeheartedly agree on this point. It just seemed premature to do this before we knew whether there's a chance you're interested in merging our changes. It's now been renamed frida-cycript.