How DVM class loading works:

After a DEX file has been loaded into memory as a DvmDex structure, its classes have still not been parsed. Class loading is the process of turning the classes in the DEX into ClassObject structures.

A ClassObject describes one complete class, and the Method structure describes each method of that class:


struct ClassObject : Object {
-- snip --
/* static, private, and <init> methods */
int directMethodCount;
Method* directMethods;
/* virtual methods defined in this class; invoked through vtable */
int virtualMethodCount;
Method* virtualMethods;
-- snip --
};

Each Method holds a pointer to the method's instructions:

struct Method {
-- snip --
/* the actual code */
const u2* insns; /* instructions, in memory-mapped .dex */
-- snip --
};
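As a rough illustration of how the two structures fit together, the sketch below walks both method arrays of a class and counts the methods that carry bytecode. The structs are simplified stand-ins that keep only the fields quoted above (plus a name field for readability), and treating insns == NULL as "no bytecode" is an assumption of this sketch, not a statement about the real Dalvik runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u2;

/* Simplified stand-ins for the Dalvik structures (illustration only). */
typedef struct Method {
    const char* name;   /* method name, kept only for readability */
    const u2*   insns;  /* instructions; NULL means "no bytecode" in this sketch */
} Method;

typedef struct ClassObject {
    int     directMethodCount;   /* static, private, and <init> methods */
    Method* directMethods;
    int     virtualMethodCount;  /* virtual methods defined in this class */
    Method* virtualMethods;
} ClassObject;

/* Count the methods of a class whose insns pointer is set. */
static int countMethodsWithCode(const ClassObject* clazz)
{
    assert(clazz != NULL);
    int count = 0;
    for (int i = 0; i < clazz->directMethodCount; i++) {
        if (clazz->directMethods[i].insns != NULL)
            count++;
    }
    for (int i = 0; i < clazz->virtualMethodCount; i++) {
        if (clazz->virtualMethods[i].insns != NULL)
            count++;
    }
    return count;
}
```

The same two loops are what any dumper has to run to reach every instruction pointer of a class.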

The Android DVM provides three ways to load a class:

(1) Explicit loading with Class.forName
(2) Explicit loading with ClassLoader.loadClass
(3) Implicit loading, for example through the new operator: when the referenced class has not been accessed before, it is loaded implicitly

Of these:

Class.forName calls the DVM function Dalvik_java_lang_Class_classForName;
ClassLoader.loadClass calls Dalvik_dalvik_system_DexFile_defineClassNative;
implicit loading calls dvmResolveClass.

The call relationship is as follows:

When to dump the DEX:

Consider the implementation of Dalvik_dalvik_system_DexFile_defineClassNative:

static void Dalvik_dalvik_system_DexFile_defineClassNative(const u4* args,
JValue* pResult)
{
-- snip --
if (pDexOrJar->isDex)
    pDvmDex = dvmGetRawDexFileDex(pDexOrJar->pRawDexFile);
else
    pDvmDex = dvmGetJarFileDex(pDexOrJar->pJarFile);

clazz = dvmDefineClass(pDvmDex, descriptor, loader);
-- snip --
}

This function is chosen as the place to insert the unpacking code for two reasons:

  1. The function is on the critical path
    Whichever loading method is used, execution is bound to reach Dalvik_dalvik_system_DexFile_defineClassNative.

  2. It has access to the in-memory DEX structure, and pDexOrJar->fileName can be matched against the target APK
    The statement pDvmDex = dvmGetJarFileDex(pDexOrJar->pJarFile) yields the in-memory DEX file structure. The DexOrJar structure it reads from contains:

/*
* Internal struct for managing DexFile.
*/
struct DexOrJar {
char* fileName; // unique string; can be used to identify the app to be unpacked
bool isDex;
bool okayToFree;
RawDexFile* pRawDexFile;
JarFile* pJarFile; // in-memory structure of the zip (APK) file
u1* pDexMemory; // malloc()ed memory, if any
};

pDvmDex represents an opened ODEX file. The DvmDex structure has a memMap member that describes the memory region backing the ODEX file:

/*
* Some additional VM data structures that are associated with the DEX file.
*/
struct DvmDex {
-- snip --
/* shared memory region with file contents */
bool isMappedReadOnly;
MemMapping memMap;
-- snip --
};

In MemMapping, addr is the start address of that memory region and length is its size:

/*
* Use this to keep track of mapped segments.
*/
struct MemMapping {
void* addr; /* start of data */
size_t length; /* length of data */
void* baseAddr; /* page-aligned base address */
size_t baseLength; /* length of mapping */
};

To dump the target DEX, wait until pDexOrJar->fileName matches the target's file name, then use pDvmDex->memMap.addr and pDvmDex->memMap.length to locate the ODEX in memory and write it out.

The ODEX format was designed to make the DVM run faster. It replaces references to framework APIs with precomputed vtable indexes, which speeds up method lookup and execution. As a result, an ODEX is tightly coupled to a specific device, or more precisely to the odex files under /system/framework on that device.

With baksmali and the odex files under /system/framework, an ODEX can be converted back into a DEX file. There is plenty of material about this online, so I won't repeat it here.
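The dump step described above can be sketched as follows. This is a minimal illustration, not DexHunter's actual code: the structs are cut down to the fields used here, and dumpIfTarget, targetName, and outPath are hypothetical names. Matching by substring is also an assumption; the text only says that fileName is matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Cut-down stand-ins for the structures shown above (illustration only). */
typedef struct MemMapping { void* addr; size_t length; } MemMapping;
typedef struct DvmDex     { MemMapping memMap; } DvmDex;
typedef struct DexOrJar   { char* fileName; } DexOrJar;

/*
 * If fileName contains targetName, write the mapped region
 * [addr, addr + length) to outPath. Returns true only when the
 * whole region was written and the file was closed cleanly.
 */
static bool dumpIfTarget(const DexOrJar* pDexOrJar, const DvmDex* pDvmDex,
                         const char* targetName, const char* outPath)
{
    assert(pDexOrJar != NULL && pDvmDex != NULL);

    if (strstr(pDexOrJar->fileName, targetName) == NULL)
        return false;                      /* not the app we want */

    FILE* fp = fopen(outPath, "wb");
    if (fp == NULL)
        return false;

    size_t written = fwrite(pDvmDex->memMap.addr, 1,
                            pDvmDex->memMap.length, fp);
    bool closed = (fclose(fp) == 0);
    return closed && written == pDvmDex->memMap.length;
}
```

Inside the real function, a call of this kind would sit right after pDvmDex has been obtained, since both pDexOrJar and pDvmDex are in scope there.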

Some remaining problems:

So far, one concern remains: a class may be loaded while its initialization (such as <clinit>) has not run yet.

Because <clinit> runs before any other method of the class, a packer can use <clinit> to modify method instructions dynamically. A DEX dumped before initialization could therefore be completely wrong.

The solution is to walk through all classes in the DEX with dvmDefineClass, check with dvmIsClassInitialized whether each class has been initialized, and call dvmInitClass on those that have not. Every class in memory is then initialized, and a reasonably correct ODEX file can be dumped.
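The control flow of that traversal might look like the sketch below. It does not call into the DVM: stubDefineClass, stubIsClassInitialized, and stubInitClass are hypothetical stand-ins for dvmDefineClass, dvmIsClassInitialized, and dvmInitClass, backed by a small in-memory table, so only the loop structure is meant to carry over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { CLASS_LOADED, CLASS_INITIALIZED } ClassStatus;

/* Minimal stand-in for ClassObject: a descriptor plus an init state. */
typedef struct ClassObject {
    const char* descriptor;
    ClassStatus status;
} ClassObject;

/* Stand-in for dvmDefineClass: look the class up by descriptor. */
static ClassObject* stubDefineClass(ClassObject* table, size_t n,
                                    const char* descriptor)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].descriptor, descriptor) == 0)
            return &table[i];
    }
    return NULL;  /* class could not be loaded */
}

/* Stand-in for dvmIsClassInitialized. */
static bool stubIsClassInitialized(const ClassObject* clazz)
{
    return clazz->status == CLASS_INITIALIZED;
}

/* Stand-in for dvmInitClass: this is where <clinit> would run. */
static bool stubInitClass(ClassObject* clazz)
{
    clazz->status = CLASS_INITIALIZED;
    return true;
}

/* Load every listed class and initialize those not yet initialized.
 * Returns how many classes were initialized by this call. */
static size_t initAllClasses(ClassObject* table, size_t n,
                             const char* const* descriptors, size_t count)
{
    size_t initialized = 0;
    for (size_t i = 0; i < count; i++) {
        ClassObject* clazz = stubDefineClass(table, n, descriptors[i]);
        if (clazz == NULL)
            continue;  /* skip classes that fail to load */
        if (!stubIsClassInitialized(clazz) && stubInitClass(clazz))
            initialized++;
    }
    return initialized;
}
```

Skipping classes that fail to load, rather than aborting, is a design choice of this sketch so that one bad class does not prevent the rest from being initialized before the dump.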

There is another problem:

The code_off marked in the figure is the offset of a direct_method's instruction bytecode relative to the start of the ODEX, and its value can point outside the ODEX memory region. A dump based solely on memMap's addr and length may therefore miss key instruction data. The solution is to save the instructions that lie outside the ODEX memory region separately as an extra section, append it to the end of the dumped ODEX, and fix up code_off and similar offsets.
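A minimal sketch of that fix-up is shown below, under several assumptions: CodeRef and relocateOutsideCode are hypothetical names, each code region is treated as an opaque byte range of known size, and the alignment that real DEX code items require is ignored. Code that already lies inside the mapped region keeps its natural offset; code outside it is copied into an extra buffer and given an offset past the end of the dump, which is where the extra buffer would be appended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One method's code: where it lives now, and the offset to record. */
typedef struct CodeRef {
    const uint8_t* code;     /* current location of the instructions */
    uint32_t       size;     /* size of the instructions in bytes */
    uint32_t       codeOff;  /* output: offset to store in the dumped file */
} CodeRef;

/*
 * base/length describe the mapped ODEX region (memMap addr and length).
 * Returns the number of bytes placed in extra; the caller appends
 * extra[0..returned) directly after the dumped region.
 */
static size_t relocateOutsideCode(const uint8_t* base, size_t length,
                                  CodeRef* refs, size_t count,
                                  uint8_t* extra, size_t extraCap)
{
    uintptr_t b = (uintptr_t)base;
    size_t used = 0;

    for (size_t i = 0; i < count; i++) {
        uintptr_t p = (uintptr_t)refs[i].code;
        bool inside = p >= b && (p - b) <= length &&
                      refs[i].size <= length - (p - b);
        if (inside) {
            refs[i].codeOff = (uint32_t)(p - b);  /* already in the dump */
            continue;
        }
        if (refs[i].size > extraCap - used)
            break;  /* extra buffer too small; caller must enlarge it */

        memcpy(extra + used, refs[i].code, refs[i].size);
        refs[i].codeOff = (uint32_t)(length + used);  /* points into extra */
        used += refs[i].size;
    }
    return used;
}
```

With a 16-byte mapped region, a method whose code sits at base + 8 keeps offset 8, while the first method found outside the region is copied to the start of extra and receives offset 16.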

DexHunter's weakness:

DexHunter inserts code into the class loading process, actively walks through and initializes all classes, and then dumps the memory. But even after class initialization is complete, the DVM gives no guarantee that the method instructions are correct.

One way to counter DexHunter is therefore to restore the instructions at a point after Dalvik_dalvik_system_DexFile_defineClassNative has finished but before the method's instructions are executed, for example by hooking the dvmDefineClass function.

Another way is to implement your own Dalvik_dalvik_system_DexFile_defineClassNative function. Judging from pDexOrJar->fileName, 360 may already have adopted a similar approach.
