Breaking Encryption: A Deep Dive into MegaCloud Extraction and Deobfuscation
2025-06-20
Abstract
This is the story of how I, someone with little prior experience in this field, tackled the MegaCloud video player's encryption. In this report, I document my journey of reverse-engineering their obfuscated JavaScript to extract the AES decryption keys. I'll cover the sudden switch from WASM to JavaScript, the confusing obfuscation techniques I had to learn about (like Control Flow Flattening), and how I eventually built my own automated deobfuscator, megacloud-key-extractor, using Babel. I also touch on my frustrating attempts to use AI for help, and what I learned along the way.
Table of Contents
- 1. Introduction to MegaCloud Encryption
- 1.5 What's the Big Deal?
- 2. The Deobfuscator's Philosophy & Toolkit
- 3. The Great Upheaval
- 4. Challenges and Dead Ends
- 5. Decoding the Obfuscation
- 6. Encryption Implementation
- 7. Deobfuscation Pipeline
- 8. Key Generation Techniques
- 9. Obfuscation Techniques
- 10. Key Extraction Process
- 11. Conclusion and The Use of AI
- Acknowledgments
1. Introduction to MegaCloud Encryption
MegaCloud is used by Hianime, but the decryption scheme is not limited to this site. It has been found on various other sites in similar forms, which suggests a common implementation shared across platforms. For a while, Hianime has used a custom video player that encrypts its video stream URLs, and until recently the decryption method was generated and obfuscated within WASM (WebAssembly).
Recently, the MegaCloud player has transitioned to a JavaScript-based encryption scheme that is heavily obfuscated. This change has made a couple of things possible:
- The decryption method can now be parsed directly in JavaScript, rather than relying on WASM decompilation. (Helps with deobfuscation.)
- The decryption scheme is not tied to any specific identifier and can be used across different streams.
Note: This tool was only tested on Hianime, but could potentially be adapted to other contexts.
1.5 What's the Big Deal? (Hianime, MegaCloud, and Video Players)
So, what are we even talking about? Let's break it down.
- Hianime: This is a popular website for streaming anime. Like many similar sites, it doesn't host the video files itself. Instead, it embeds a video player from a third-party service.
- MegaCloud: This is the third-party service that hosts the video files and provides the player. When you watch an episode on Hianime, you're actually streaming it from MegaCloud's servers.
- The Player & Encryption: To protect their content, MegaCloud doesn't just give you a direct link to the video file. Instead, the link is encrypted. The webpage loads a special JavaScript "player" that contains the code needed to decrypt this link and play the video.
So, why would we want to break this? By extracting the decryption key, we can get the raw video stream URL. This is useful for a few reasons:
- Watching in other players: You can use the URL in a desktop player like VLC, which often provides a better experience.
- Archiving: It allows you to save a copy of the stream.
- Bypassing player limitations: It can help avoid ads, tracking, or other unwanted features built into the web player.
Essentially, this whole process is about regaining control over the video stream.
2. The Deobfuscator's Philosophy & Toolkit
Deobfuscating a target like this is less about a specific list of tools and more about an approach: programmatic analysis. I quickly learned that the obfuscation is designed to defeat manual inspection, so the only scalable solution is to fight code with code. The toolkit, therefore, is built around this philosophy:
- ASTs are Everything: The Abstract Syntax Tree is the battlefield. Instead of seeing a wall of text, you see a structured, parsable graph of the code's logic. Tools like AST Explorer are indispensable for initial reconnaissance, but the real work is done with code.
- Programmatic Transformation with Babel: To modify the AST, you need a powerful and flexible tool. Babel became the engine of my deobfuscator. The entire process is a sequence of custom Babel plugins, each designed to identify and reverse one specific layer of obfuscation.
- The Debugger as a Last Resort: When static and programmatic analysis failed, a debugger was the final tool for stepping through the code line-by-line. It's slow and painful but was sometimes necessary to understand a particularly tricky piece of the live execution flow, especially when I was just starting out.
3. The Great Upheaval (WASM to JavaScript obfuscation)
For the longest time, those who dared to reverse-engineer MegaCloud's player were greeted by a wall of WebAssembly (WASM). It was a black box, but someone soon reverse-engineered the code, and we took it for granted. Then, one day, everything changed. The WASM was gone, replaced by a sprawling, chaotic mess of obfuscated JavaScript 😭.
At first, this seemed like a welcome change. JavaScript is, after all, more transparent than compiled WASM. But this initial optimism quickly faded, giving way to a collective headache. The JavaScript was mangled, obfuscated, and completely unreadable.
Here's a small taste of the nightmare that developers were suddenly faced with:
Q[309497]=(function(){var l1=2;for(;l1 !== 9;){switch(l1){case 2:l1=typeof globalThis === '\x6f\x62\u006a\x65\x63\u0074'?1:5;break;case 1:return globalThis;break;case 5:var c2;try{var b4=2;for(;b4 !== 6;){switch(b4){case 9:delete c2['\x74\u0039\u0048\u0078\x53'];var T_=Object['\x70\x72\u006f\u0074\x6f\x74\x79\x70\u0065'];delete
T_['\u0052\x38\u0041\u0050\x7a'];b4=6;break;case 3:throw "";b4=9;break;case 4:b4=typeof t9HxS === '\u0075\u006e\x64\x65\x66\x69\u006e\x65\x64'?3:9;break;case 2:Object['\x64\x65\x66\x69\u006e\x65\x50\u0072\x6f\x70\x65\x72\u0074\x79'](Object['\x70\x72\u006f\u0074\x6f\x74\x79\x70\u0065'],'\x52\u0038\u0041\u0050\x7a',{'get':
function(){return this;},'\x63\x6f\x6e\x66\x69\x67\x75\x72\x61\x62\x6c\x65':true});c2=R8APz;c2['\x74\u0039\u0048\u0078\x53']=c2;b4=4;break;}}}catch(A9){c2=window;}return c2;break;}}})();Q.y9=y9;N7b(Q[309497]);Q[46016]=(function(){var z0$=2;for(;z0$ !== 5;){switch(z0$){case 2:var R1a={u7ODdj6:(function(l1q){var v_d=2;for(;v_d !== 18;){
switch(v_d){case 13:(I$t++,I11++);v_d=8;break;case 8:v_d=I$t < Q3v.length?7:12;break;case 2:var n1T=function(H62){var s7J=2;for(;s7J !== 11;){switch(s7J){case 4:var o7z=0;s7J=3;break;case 13:s7J=!a9w?6:12;break;case 9:S8I[o7z]=t_m(H62[o7z] + 98);s7J=8;break;case 2:var t_m=Q.k2();var X86=Q.B$();
/* ... hundreds more lines of obfuscated code ... */
This code, with its cryptic variable names, hex-encoded strings, and flattened control flow, was a far cry from structured WASM. It required a completely new approach, a new set of tools, and a lot of patience.
4. Challenges and Dead Ends: The Road to Understanding
The path to a working extractor wasn't a straight line. The initial confrontation with the obfuscated JavaScript led to several failed attempts and dead ends, each one a lesson in what not to do.
My early attempts to use simple regular expressions to find the key failed spectacularly. The computed property names and encoded strings meant there was no consistent, predictable pattern to latch onto. Another dead end was using generic tools. Running the code through a formatter like Prettier was a first step, but it only produced a neatly formatted wall of nonsense. In a code editor, it just looked like a long list of meaningless variable assignments and function calls. The code was now legible, but the logic was still a tangled mess of switch statements and array lookups that made no sense to a human reader.
I also looked into more powerful, automated deobfuscators like Webcrack. While incredibly impressive, these general-purpose tools couldn't fully unravel MegaCloud's obfuscator out-of-the-box, since it seemed to have its own unique flavour. They got a small part of the way but yielded no real gain, and the constant changes to the obfuscation meant that a tailored solution was the only reliable path forward.
These failures made it clear that a surface-level approach was doomed. The only way forward was to parse the code into an Abstract Syntax Tree (AST) and attack its structure programmatically.
5. Decoding the Obfuscation: CFF and Computed Properties
After the initial shock, the next step was to dissect this new beast. Two techniques immediately stood out as the cornerstones of this obfuscation: Control Flow Flattening (CFF) and Computed Property Names combined with IIFEs.
Control Flow Flattening (CFF)
What is it?
Control Flow Flattening is a technique that dismantles the natural, linear flow of a program (if-statements, loops, function calls) and hides it inside a single, massive loop with a switch statement. A state variable is used to determine which block of code (which case) to execute next, effectively "flattening" the code's structure into a spaghetti of jumps.
Implications & How to Spot It
The primary goal of CFF is to make the code excruciatingly difficult for humans to follow. You can spot it by looking for a while(true) or for(;;) loop containing a switch statement that dispatches to different code blocks based on a variable that's constantly being reassigned. The provided snippet is a classic example of this. For instance, the central dispatcher is immediately visible:
for(;l1 !== 9;){
  switch(l1){
    case 2:
      l1=typeof globalThis === '\x6f\x62\u006a\x65\x63\u0074'?1:5;
      break;
    case 1:
      return globalThis;
      break;
    // ... more cases
  }
}
Here, the variable l1 acts as the state pointer, and its value is manipulated within each case to determine the next jump, rather than executing code in a linear sequence.
Defeating It
Manually tracing the logic is nearly impossible. Defeating CFF requires a programmatic approach to rebuild the original control flow graph by analysing the state transitions. This is a non-trivial task that forms the first major step in any serious deobfuscation effort.
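To give a feel for what "analysing the state transitions" means, here is a minimal sketch of the idea. It is not code from megacloud-key-extractor. It assumes the dispatcher's cases have already been pulled out into a plain object (the cases map and its body and next fields are hypothetical), and it only follows unconditional jumps. Branching states such as l1 = cond ? 1 : 5 would have to be rebuilt as if/else and are left out.

```javascript
// Hypothetical intermediate form: each case is reduced to the statements it
// runs and the state it jumps to next. Listed out of order, as in a real switch.
const cases = {
  4: { body: ['stepB();', 'stepC();'], next: 3 },
  3: { body: ['stepD();'], next: 9 },
  2: { body: ['stepA();'], next: 4 },
};

// Walk the state machine from `start` until `exit`, emitting statements in
// execution order. Revisiting a state means a loop, which this sketch rejects.
function lineariseStates(caseMap, start, exit) {
  const ordered = [];
  const seen = new Set();
  let state = start;
  while (state !== exit) {
    if (seen.has(state)) throw new Error(`loop detected at state ${state}`);
    const current = caseMap[state];
    if (!current) throw new Error(`unknown state ${state}`);
    seen.add(state);
    ordered.push(...current.body);
    state = current.next;
  }
  return ordered;
}

const recovered = lineariseStates(cases, 2, 9);
console.log(recovered.join('\n'));
```

The result is the four statements in the order they actually execute, which is the linear code the flattening hid. A real pass does the same walk on AST nodes instead of strings.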
Computed Properties and IIFEs
What is it?
This is a two-pronged attack. Firstly, instead of accessing object properties directly (e.g., console.log), the obfuscator computes the property name at runtime. Secondly, it uses Immediately Invoked Function Expressions (IIFEs) to encapsulate logic and populate a central "module" object. An IIFE is a function that is defined and executed right away, creating a private scope for its variables.
Implications & How to Spot It
This combination makes static analysis incredibly difficult. The code is littered with examples like this:
Q[309497]=(function(){
  // ... a lot of complex, scoped logic ...
  return some_value_or_function;
})();
This is an IIFE. The entire (function(){...})() is executed on the spot, and its result is assigned to Q[309497]. This hides the logic for creating Q[309497] inside a temporary scope. This is often combined with obfuscated property names using hex or Unicode escape sequences:
c2['\x74\u0039\u0048\u0078\x53']
A simple text search for a function name is useless. You have to first evaluate the computed property (e.g., '\x74\u0039\u0048\u0078\x53' becomes 't9HxS') and then understand what the IIFE assigned to that property actually does.
Defeating It
A deobfuscator must be able to perform constant folding to resolve the computed property names. For the IIFEs, it often involves simplifying the body of the function to understand what it returns, and in many cases, replacing the entire IIFE with its final, returned value if possible.
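As a small illustration of resolving those escaped names, the sketch below decodes \xNN and \uNNNN sequences from the raw text of a string literal. The decodeEscapes helper is my own name for illustration, and it works on raw text rather than on an AST, so treat it as a demonstration of the idea rather than how the tool does it.

```javascript
// Decode \xNN and \uNNNN escapes found in the raw text of a string literal.
function decodeEscapes(raw) {
  const toChar = (_, hex) => String.fromCharCode(parseInt(hex, 16));
  return raw
    .replace(/\\x([0-9a-fA-F]{2})/g, toChar)
    .replace(/\\u([0-9a-fA-F]{4})/g, toChar);
}

// String.raw keeps the backslashes exactly as they appear in the obfuscated file.
const propertyName = decodeEscapes(String.raw`\x74\u0039\u0048\u0078\x53`);
const typeofTarget = decodeEscapes(String.raw`\x6f\x62\u006a\x65\x63\u0074`);

console.log(propertyName); // t9HxS
console.log(typeofTarget); // object
```

Once the names are plain text, c2['\x74\u0039\u0048\u0078\x53'] can be read as c2.t9HxS, and the typeof check in the dispatcher turns out to be an ordinary comparison against 'object'.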
Tackling these layers of obfuscation required a systematic, automated approach. This led me to the work of Ciarán and his e1-player-deobf tool, which served as a foundational inspiration for the multi-pass deobfuscation pipeline I ended up building.
6. Encryption Implementation
Turning to the encryption itself: the player uses the CryptoJS library, specifically its AES module. The encryption keys are generated dynamically at runtime by the script.
Here's a deobfuscated code example showing a typical pattern:
i = l0pTy['CryptoJS'];
X = i['AES']['decrypt'].bind(i['AES']);
r = i['enc']['Utf8'];
Note: All the variables are random and scrambled constantly, so they will not be the same in every instance.
The code does some variation of binding the AES decrypt function and establishing the UTF-8 encoding method. The actual decryption (often) happens in a wrapper function. Here's a simplified version:
l0pTy['__z1d'] = b => {
  // ...
  try {
    // ...
    R = j(); // <--- j() returns the key, so this is the call the extractor targets
    f = X(b, R); // decrypt the ciphertext b with the key R
    // ...
    return f.toString(r);
  } catch (n) {
    console.error(n);
  }
  // ...
};
Where j() is a function that returns the encryption key, and X is the bound CryptoJS.AES.decrypt function.
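For context on what X(b, R) does with a string key: when CryptoJS is handed a string rather than a WordArray, it treats it as a passphrase and derives the real AES-256 key and IV from it using OpenSSL's MD5-based key derivation, reading the salt from a "Salted__" header in the ciphertext. The sketch below reproduces that with Node's built-in crypto module instead of CryptoJS. That MegaCloud's payload uses exactly this layout is my assumption based on the call shape above, and the function names, sample URL, and passphrase are made up.

```javascript
const crypto = require('crypto');

// OpenSSL's EVP_BytesToKey with MD5 and one iteration: chain MD5 digests of
// (previous digest + passphrase + salt) until 32 key bytes + 16 IV bytes exist.
function deriveKeyAndIv(passphrase, salt) {
  const pass = Buffer.from(passphrase, 'utf8');
  let derived = Buffer.alloc(0);
  let block = Buffer.alloc(0);
  while (derived.length < 48) {
    block = crypto.createHash('md5').update(Buffer.concat([block, pass, salt])).digest();
    derived = Buffer.concat([derived, block]);
  }
  return { key: derived.subarray(0, 32), iv: derived.subarray(32, 48) };
}

// Decrypt base64 text laid out as "Salted__" + 8-byte salt + AES-256-CBC body.
function decryptPassphrase(base64Text, passphrase) {
  const raw = Buffer.from(base64Text, 'base64');
  if (raw.subarray(0, 8).toString('latin1') !== 'Salted__') {
    throw new Error('missing OpenSSL salt header');
  }
  const { key, iv } = deriveKeyAndIv(passphrase, raw.subarray(8, 16));
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString('utf8');
}

// Matching encryptor, included only so the sketch can be exercised end to end.
function encryptPassphrase(plainText, passphrase) {
  const salt = crypto.randomBytes(8);
  const { key, iv } = deriveKeyAndIv(passphrase, salt);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const body = Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
  return Buffer.concat([Buffer.from('Salted__', 'latin1'), salt, body]).toString('base64');
}

const sample = encryptPassphrase('https://example.invalid/master.m3u8', 'not-a-real-key');
console.log(decryptPassphrase(sample, 'not-a-real-key'));
```

This is why the extracted key is all that's needed: once the string that j() returns is known, the decryption can be reproduced outside the player.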
7. Deobfuscation Pipeline
The key to making this work was creating a deobfuscation pipeline. This is what I implemented in my own tool, megacloud-key-extractor. It looks something like this:
[Obfuscated JavaScript]
|
v
[Pass 1: Unflatten Control Flow] --> [Reconstructs loops/if-statements]
|
v
[Pass 2: Decode String Arrays] --> [Replaces array access with literals]
|
v
[Pass 3: Inline Functions] --> [Simplifies wrapper calls]
|
v
[Pass 4: Constant Folding] --> [Resolves computed properties]
|
v
[Pass 5: Key Extraction] --> [Identifies and extracts the AES key]
|
v
[Clean, Deobfuscated Snippet & Key]
This pipeline, inspired by Ciarán's original tool, includes:
- Normalisation & Unflattening: Simplifies literals and reconstructs control flow
- Data Inlining: Inlines arrays and wrapper functions
- Core Logic Solving: Decodes string arrays and resolves state machines
- String Array Inlining: Replaces string array accessors with literals
- Final Simplification: Cleans up the code by inlining functions and removing dead code
- Key Extraction: Identifies and extracts the AES key
Credit goes to Ciarán, author of the e1-player-deobf tool, for the original idea that inspired my own version.
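The order of the passes matters, because each one only recognises code that earlier passes have already simplified. The sketch below shows this with two string-level stand-ins for AST passes. The pass names and the sample input are hypothetical, and the real tool runs Babel plugins rather than regular expressions.

```javascript
// Pass: fold adjacent string literal concatenations, e.g. 'A' + 'ES' -> 'AES'.
function foldStringConcats(code) {
  const pattern = /'([^'\\]*)'\s*\+\s*'([^'\\]*)'/;
  let previous;
  do {
    previous = code;
    code = code.replace(pattern, "'$1$2'");
  } while (code !== previous);
  return code;
}

// Pass: turn obj['name'] into obj.name when name is a plain identifier.
function bracketToDot(code) {
  return code.replace(/\['([A-Za-z_$][\w$]*)'\]/g, '.$1');
}

// Run the passes in order, each consuming the previous pass's output.
function runPipeline(code, passes) {
  return passes.reduce((current, pass) => pass(current), code);
}

const input = "i['A' + 'ES']['dec' + 'ry' + 'pt']";
const rightOrder = runPipeline(input, [foldStringConcats, bracketToDot]);
const wrongOrder = runPipeline(input, [bracketToDot, foldStringConcats]);

console.log(rightOrder); // i.AES.decrypt
console.log(wrongOrder); // i['AES']['decrypt']
```

Run in the wrong order, the property pass sees nothing it recognises and the output is only half cleaned. That is the reason the key extraction step sits at the very end of the pipeline.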
8. Key Generation Techniques (Example)
8.1 Character Code Arrays
One pattern observed in practice is the use of character code arrays that are converted to strings:
E = [
100, 102, 57, 50, 56, 51, 51, 56, 56, 100, 52, 55, 54, 48, 101, 54, 101, 57, 97, 52, 97, 48, 99, 99, 55, 51, 98,
101, 49, 51, 52, 53, 51, 98, 51, 56, 53, 97, 99, 102, 51, 56, 57, 51, 55, 50, 50, 55, 55, 48, 56, 99, 48, 102, 55,
97, 98, 54, 55, 57, 57, 53, 98, 97
];
j = () => {
  o.l6T.R0DdC_o();
  if (!o.Z9D.S$gTmE1()) {
    return String.fromCharCode(...E); // Converts character codes to string
  }
};
This array of character codes is spread into the String.fromCharCode() function, which converts each code to its corresponding character, producing the AES key.
The "How": Surgical Strikes with AST Pattern Matching
So how does a Babel plugin actually find this pattern? It works by pattern matching against the AST's structure. You write a "visitor" that tells Babel: "I'm looking for a piece of code that has this exact shape."
For the example above, instead of searching for the text String.fromCharCode, you'd create a visitor to find a CallExpression (a function call) where the function being called (the callee) is a MemberExpression (like object.property) where the property is an Identifier with the name fromCharCode.
It looks something like this in code:
const myVisitor = {
  CallExpression(path) {
    // path.node is the current AST node
    const callee = path.node.callee;
    if (
      callee.type === 'MemberExpression' &&
      callee.property.type === 'Identifier' &&
      callee.property.name === 'fromCharCode'
    ) {
      // Found it! Now do something with it,
      // like resolving the array `E` and extracting its values.
      console.log('Found a String.fromCharCode call!');
    }
  }
};
This is incredibly powerful. It allows you to perform surgical strikes, finding and transforming very specific, complex patterns of code while ignoring everything else. This is the core technique used in megacloud-key-extractor to reverse the obfuscation layer by layer. It comes with real trade-offs in performance and accuracy, though. AST traversal is computationally expensive, since the matching logic has to run on every branch of the tree until the traversal exits. And if the obfuscator changes the structure even slightly, a rigid pattern matcher can miss code that is nearly identical to a known pattern.
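To make the "now do something with it" step concrete, here is a dependency-free sketch that resolves the array behind a String.fromCharCode(...E) call. It walks hand-built nodes shaped like Babel's parser output instead of using Babel's traverse, and the walk and extractCharCodeKeys helpers are illustrative names rather than the plugin's real code. The three-element array is just the first three codes from the example above.

```javascript
// Hand-built nodes in the shape Babel's parser produces for:
//   E = [100, 102, 57];  String.fromCharCode(...E);
const ast = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'AssignmentExpression',
        operator: '=',
        left: { type: 'Identifier', name: 'E' },
        right: {
          type: 'ArrayExpression',
          elements: [100, 102, 57].map((value) => ({ type: 'NumericLiteral', value })),
        },
      },
    },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          computed: false,
          object: { type: 'Identifier', name: 'String' },
          property: { type: 'Identifier', name: 'fromCharCode' },
        },
        arguments: [{ type: 'SpreadElement', argument: { type: 'Identifier', name: 'E' } }],
      },
    },
  ],
};

// Minimal depth-first walk: call `visit` on every object that has a `type`.
function walk(node, visit) {
  if (Array.isArray(node)) return node.forEach((child) => walk(child, visit));
  if (node === null || typeof node !== 'object') return;
  if (typeof node.type === 'string') visit(node);
  Object.values(node).forEach((child) => walk(child, visit));
}

function extractCharCodeKeys(root) {
  const numericArrays = new Map(); // identifier name -> array of numbers
  const keys = [];
  // First walk: remember every `name = [n, n, ...]` assignment.
  walk(root, (node) => {
    if (node.type !== 'AssignmentExpression') return;
    if (node.left.type !== 'Identifier' || node.right.type !== 'ArrayExpression') return;
    const { elements } = node.right;
    if (elements.every((el) => el && el.type === 'NumericLiteral')) {
      numericArrays.set(node.left.name, elements.map((el) => el.value));
    }
  });
  // Second walk: resolve String.fromCharCode(...name) against those arrays.
  walk(root, (node) => {
    if (node.type !== 'CallExpression' || node.callee.type !== 'MemberExpression') return;
    const { property } = node.callee;
    if (property.type !== 'Identifier' || property.name !== 'fromCharCode') return;
    const [arg] = node.arguments;
    if (!arg || arg.type !== 'SpreadElement' || arg.argument.type !== 'Identifier') return;
    const codes = numericArrays.get(arg.argument.name);
    if (codes) keys.push(String.fromCharCode(...codes));
  });
  return keys;
}

const extracted = extractCharCodeKeys(ast);
console.log(extracted); // [ 'df9' ]
```

Collecting the arrays first and resolving the calls second mirrors the rigidity problem described above: if the obfuscator switched from an assignment to a declaration, the first walk would need another case before anything was found.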
8.2 Other Observed Patterns
The key extractor is designed to handle several other key-construction patterns:
- Array Join: Keys constructed by joining array elements
- String Manipulation: Keys built through concatenation, slicing, or reversal of strings
- Proxied Functions: Keys generated through chains of function calls
- Indexed Mapping: Keys built by mapping indices to characters
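To show what these four patterns look like in their simplest form, here is a sketch where each one builds the same dummy value. All the names and values are invented for illustration, and real instances are buried under the other obfuscation layers.

```javascript
// Dummy target value; not a real key.
const expected = 'c0ffee';

// Array join: fragments stored as array elements.
const viaJoin = ['c0', 'ff', 'ee'].join('');

// String manipulation: the value is stored reversed and flipped at runtime.
const viaReverse = 'eeff0c'.split('').reverse().join('');

// Proxied functions: each wrapper returns one segment of the value.
const segmentA = () => 'c0f';
const segmentB = () => 'fee';
const viaProxies = segmentA() + segmentB();

// Indexed mapping: an index list selects characters from a lookup string.
const lookup = '0cef';
const viaIndices = [1, 0, 3, 3, 2, 2].map((index) => lookup[index]).join('');

const allMatch = [viaJoin, viaReverse, viaProxies, viaIndices].every((key) => key === expected);
console.log(allMatch); // true
```

Each variant needs its own matcher, which is why the extractor tries several strategies rather than relying on one.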
9. Obfuscation Techniques
The obfuscated code employs several defensive techniques:
Object.freeze(i);
Object.freeze(i['AES']);
Object.freeze(i['enc']);
Object.defineProperty(
  i['AES'],
  'decrypt',
  (() => {
    if (!o[103210].o2r0mKV()) {
      var M6 = { writable: false };
    }
    // ...
    M6.configurable = false;
    // ...
    return M6;
  })()
);
These techniques include:
- Freezing objects to prevent modification
- Making properties non-writable and non-configurable
- Complex conditional logic with obfuscated checks
- State machine operations (side effect of control flow flattening)
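The practical effect of the freezing and the locked descriptor is easy to demonstrate: once they are applied, a hook can no longer be swapped in for decrypt. The sketch below uses a stand-in object instead of the real CryptoJS namespace (lib and hook are hypothetical names), and my reading that this is aimed at runtime hooking is an inference from the code above.

```javascript
// Stand-in for the CryptoJS namespace; the real library is not loaded here.
const lib = { AES: { decrypt: () => 'plaintext' } };

// Lock the decrypt property the way the player's descriptor does, then freeze.
Object.defineProperty(lib.AES, 'decrypt', { writable: false, configurable: false });
Object.freeze(lib.AES);
Object.freeze(lib);

// A hook that would expose the key if it could be installed.
const hook = (ciphertext, key) => `intercepted ${key}`;

// Reflect.set and Reflect.defineProperty report failure as `false` instead of throwing.
const assigned = Reflect.set(lib.AES, 'decrypt', hook);
const redefined = Reflect.defineProperty(lib.AES, 'decrypt', { value: hook });

console.log(assigned, redefined, lib.AES.decrypt === hook); // false false false
```

None of this affects static analysis, which is one more reason the extractor reads the key out of the source instead of intercepting it at runtime.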
10. Key Extraction Process
The extraction process is implemented in the keyExtractionPlugin.js file. It starts by initialising the containers that the later passes fill:
// Initialise result containers
let foundKeys = [];
let nonHexCandidates = [];
let wrongLengthCandidates = [];
// Initialise collection containers (used for summary)
let segmentFunctionsMap = {};
let potentialKeyArrays = {};
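The names of the three result containers suggest how candidates get sorted. The sketch below is my guess at that logic, not the plugin's actual code. The expected length of 64 characters is an assumption taken from the 64-code example array in section 8.1, and classifyCandidates is a hypothetical helper.

```javascript
// Assumption: a valid key is 64 hex characters, like the example in section 8.1.
const EXPECTED_KEY_LENGTH = 64;

function classifyCandidates(candidates) {
  const foundKeys = [];
  const nonHexCandidates = [];
  const wrongLengthCandidates = [];
  for (const candidate of candidates) {
    if (!/^[0-9a-f]+$/i.test(candidate)) {
      nonHexCandidates.push(candidate); // contains characters outside 0-9, a-f
    } else if (candidate.length !== EXPECTED_KEY_LENGTH) {
      wrongLengthCandidates.push(candidate); // hex, but not the expected size
    } else if (!foundKeys.includes(candidate)) {
      foundKeys.push(candidate); // plausible key, stored once
    }
  }
  return { foundKeys, nonHexCandidates, wrongLengthCandidates };
}

const dummyKey = 'ab'.repeat(32); // 64 hex characters, not a real key
const classified = classifyCandidates([dummyKey, 'CryptoJS', 'deadbeef', dummyKey]);
console.log(classified);
```

Keeping the rejected candidates around is handy for debugging: when the obfuscator changes, a string that lands in the wrong-length bucket is often the first clue about what moved.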
10.1 Multi-Pass Extraction Strategy
The extractor works in multiple passes:
- Direct Scan: Looks for common key patterns
- Collection Phase: Collects arrays and segment functions
- Extraction Phase: Applies multiple extraction strategies
10.2 String.fromCharCode Pattern Recognition
The extractor specifically identifies String.fromCharCode() patterns:
if (item.type && item.type.includes('fromCharCode')) {
//...
}
This helps verify that the extracted key correctly represents the character codes in the source array.
11. Conclusion and The Use of AI
The MegaCloud encryption scheme, while formidable, ultimately yields to a systematic, multi-pass deobfuscation strategy. By programmatically reversing control flow flattening, resolving computed properties, and tracing data through the abstracted code, we can reliably extract the AES keys needed to access the content. This very manual approach, while effective, comes with trade-offs in performance and maintenance.
The modular and extensible design of the extractor allows it to adapt, but the landscape is definitely shifting. The rise of Large Language Models (LLMs) presents a new angle:
- The Promise (Pros): AI-based tools can often "see through" obfuscation without needing a complex, custom-built pipeline. They can generalise better, making them more resilient to minor changes in the obfuscator's output that would break a rigid, rule-based system.
- The Reality (Cons): LLMs can be a "black box" and are not always reliable. They might fail on novel patterns and, perhaps most importantly, they cost money that I... don't have. The reliance on external APIs and the computational cost make them a less accessible tool for hobbyist reverse-engineers.
The future probably lies in a hybrid approach: using AI for broad-strokes deobfuscation to get a "good enough" version of the code, followed by targeted, hand-crafted scripts for the final extraction. For now, the methodical, AST-based approach remains the most accurate and cost-effective tool for a hobbyist like me.
Practical Frustrations: AI in Plugin Development
While developing the Babel plugins for my deobfuscator, I attempted to leverage LLMs to accelerate the process. The experience was a mix of promise and immense frustration. The AI was surprisingly adept at pattern generation—I could describe a type of obfuscation, and it would generate a plausible Babel plugin structure for identifying and transforming it. It helped me develop new angles of attack when I was stuck.
The failure came in pattern application. When given the actual, messy, obfuscated code, the AI-generated plugins almost always failed. They were too generic, missed edge cases, or hallucinated AST node properties. The core issue was the AI's inability to "target" the pattern successfully within a sea of unrelated, complex code. This led to a frustrating loop: get a good idea from the AI, spend hours manually debugging and rewriting the plugin to work on the real code, and repeat. The AI could point to the needle in the haystack, but it couldn't pick it up.
The End Result
It's a constant cat-and-mouse game, but for now, we can enjoy the temporary pleasure of having a working solution (probably broken by the time you're reading this though). And honestly, figuring this stuff out is pretty cool.
You can find the final tool, megacloud-key-extractor, on my GitHub. It implements the entire pipeline described here. The README has instructions on how to use it. If you find it useful, please consider starring it on GitHub:
https://github.com/Eggwite/megacloud-key-extractor
Acknowledgments
I want to give a huge shoutout to Ciarán for his work on the e1-player-deobf tool, which served as the foundation for my own deobfuscator. His insights into the obfuscation techniques were invaluable.