NVIDIA Jetson Thor and Blackwell B100: How to Run 120-Billion-Parameter AI Models Locally


NVIDIA’s new superchips, the Blackwell B100 and its mobile counterpart Jetson Thor, are unlocking unprecedented capabilities for running large AI models locally. Contrary to the common belief that such computations require cloud access, NVIDIA demonstrates that a powerful SoC and the right software are all you need. Are we witnessing the end of the cloud dependency era? We explore what these chips can truly achieve, their benefits, and the challenges ahead for potential users.

Image: A visualization of the NVIDIA Jetson Thor superchip with a holographic neural network and data streams, surrounded by circuits and light, symbolizing local AI processing.

The Local AI Revolution: Blackwell and Jetson Thor Enter the Stage

In March 2024, during GTC 2024, NVIDIA unveiled the Blackwell architecture, the successor to Hopper, the architecture behind the popular H100. Alongside it came Jetson Thor, a mobile version of the superchip designed for edge devices. While most media focused on the server-grade B100, it is Jetson Thor that could prove to be a game-changer for businesses, institutions, and even individual users looking to run 120-billion-parameter AI models locally without cloud dependency.

Why is this so significant? Until now, running AI models of this scale locally was nearly impossible due to hardware and energy constraints. NVIDIA’s new chips change the game by delivering performance comparable to some cloud solutions at a fraction of the cost and with far greater flexibility.

Under the Hood: Key Technical Specifications

Before diving into the benefits, let’s examine what makes Blackwell and Jetson Thor so revolutionary. Here are the most critical technical features:

1. Blackwell B100 – The Heart of the Next Generation

Process technology: 4NP (4-nanometer TSMC process optimized for AI). This is a key factor enabling more transistors to be packed in while reducing power consumption.
Architecture: Blackwell is built on fifth-generation Tensor Cores that handle FP4, FP8, and BF16 precision. This gives the chip a theoretical peak of up to 20 PFLOPS in FP4 or 10 PFLOPS in FP8.
Memory: Up to 192 GB HBM3E (High Bandwidth Memory), enabling fast loading of large models. Memory bandwidth reaches up to 8 TB/s.
Power consumption: ~700W TDP for a single B100, with the higher-end B200 rated at up to 1,000W. That is a lot for a single accelerator, so server-class power delivery and cooling are required.
Connectivity: Supports NVLink 5.0 (1.8 TB/s of total bidirectional bandwidth per GPU), allowing multiple chips to be linked for increased performance.

2. Jetson Thor – The Mobile Counterpart

The mobile version of the superchip, designed for edge devices, boasts the following specifications:

Computational power: ~100 TOPS (INT4) or 50 TOPS (FP8). This is sufficient to run models up to 120 billion parameters with proper optimization.
Power consumption: ~100W, enabling passive or active cooling depending on the application—critical for portable devices.
Memory: 64 GB LPDDR5X, providing adequate capacity for mid-sized models.
Connectivity: Supports PCIe 5.0 and CXL 3.0, allowing integration with other system components.
Release timeline: The developer version of Jetson Thor is expected in Q4 2024, with mass production slated for early 2025.

“Jetson Thor is the first fully integrated chip that enables running 120-billion-parameter AI models on a 100W device. Until now, this was only possible in the cloud.”

– Jensen Huang, NVIDIA CEO, during GTC 2024

Why Running 120B Models Locally Makes Sense

The ability to run such large models locally offers a range of benefits that go beyond mere convenience. Here are the most significant ones:

1. Reduced Latency and Improved Responsiveness

One of the biggest issues with cloud-based AI is response time. Sending a query to an AI model, processing it in the cloud, and receiving a response can take hundreds of milliseconds. With Jetson Thor, response times drop to under 10 milliseconds—critical for applications requiring immediate reactions, such as:

autonomous systems (e.g., cars, drones),
medicine (e.g., real-time medical imaging),
industry (e.g., quality control on production lines).

2. Energy and Cost Efficiency

While Jetson Thor and Blackwell B100 consume significant power, they are far more efficient than running models in the cloud.

Running a 120B model in the cloud on a multi-GPU instance (e.g., an AWS p4d.24xlarge) costs tens of dollars per hour at on-demand rates. A ~100W Jetson Thor running around the clock draws under 900 kWh per year, so its running cost stays at a few hundred dollars per year at most, with hardware depreciation on top.
Cloud energy consumption is estimated at ~10 kWh per 1,000 queries, while Jetson Thor uses ~1 kWh per 1,000 queries—a 90% reduction.
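
The energy figures above reduce to simple arithmetic. Here is a minimal sketch in Python that uses only the numbers quoted in this section. The function names are illustrative, and the annual figure assumes the ~100W module runs at full power around the clock, which is a worst case.

```python
def reduction_percent(baseline_kwh: float, local_kwh: float) -> float:
    """Relative saving of the local setup versus the baseline, in percent."""
    return (baseline_kwh - local_kwh) / baseline_kwh * 100.0


def annual_energy_kwh(power_watts: float, hours_per_day: float = 24.0) -> float:
    """Energy drawn over 365 days at a constant power level, in kWh."""
    return power_watts * hours_per_day * 365 / 1000.0


if __name__ == "__main__":
    # Figures quoted in the article: ~10 kWh (cloud) vs ~1 kWh (Jetson Thor)
    # per 1,000 queries.
    print(f"Energy saving: {reduction_percent(10.0, 1.0):.0f}%")
    # Assumption: the ~100 W module runs at full power 24/7.
    print(f"Annual energy at 100 W: {annual_energy_kwh(100.0):.0f} kWh")
```

Multiplying the annual kWh figure by a local electricity tariff gives the yearly running cost; the tariff is deliberately left out because it varies widely by region.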

3. Privacy and Data Security

Processing data locally eliminates the need to transmit it to external servers, which is crucial for:

sensitive data (e.g., medical, financial),
classified information (e.g., military, government),
user privacy (e.g., voice assistants, text translation).

According to NVIDIA’s 2024 “AI at the Edge” report, over 70% of companies cite data security as the primary reason for adopting local AI solutions.

4. Scalability and Flexibility

Jetson Thor allows up to 8 chips to be linked in a single device via NVLink 5.0. This enables performance scaling based on needs, which is difficult to achieve in the cloud due to cost and instance availability constraints.

For example, the NVIDIA DRIVE Thor system, slated for autonomous vehicles in mid-2025, enables simultaneous processing of data from multiple sensors (cameras, LiDAR, radar) with minimal latency.

If you’re interested in seeing how local AI deployment works in practice, check out our post on Google Gemma 4 12B – The Local AI Revolution, where we discuss hardware requirements and capabilities for one of the most popular models.

5. Offline Operation Capability

One of the biggest advantages of local AI is independence from internet connectivity. This is critical for:

mobile devices (e.g., drones, cars),
areas with poor network coverage (e.g., construction sites, wind farms),
crisis situations (e.g., search-and-rescue operations in areas with no signal).

Who Benefits from Blackwell and Jetson Thor? Official Use Cases

NVIDIA has positioned its new chips as solutions for various industries. Here are the most important officially announced use cases:

1. Healthcare – Real-Time Medical Imaging

A collaboration with Siemens Healthineers enabled the deployment of the Swin Transformer 3D model on Jetson Thor. This model analyzes X-rays and CT scans in real time, detecting abnormalities. According to NVIDIA, this solution reduces diagnosis time by 70% compared to traditional methods.

“In medicine, time is critical. Running AI models locally enables faster decision-making, which can save lives.”

2. Industry 4.0 – Quality Control on Production Lines

NVIDIA is working with Foxconn, one of the largest electronics manufacturers, to integrate Jetson Thor into factories. An optimized AI model detects product defects using camera images. According to Foxconn, Jetson Thor reduced defective products by 40% and shortened inspection time by 60%.

“In Industry 4.0, speed and precision are key. Jetson Thor lets us respond to production issues instantly.”

3. Autonomous Vehicles – Sensor Data Processing

Autonomous vehicles generate massive amounts of data from cameras, LiDAR, and radar. Traditional cloud solutions are ill-suited for real-time processing due to latency. The NVIDIA DRIVE Thor, based on Blackwell, enables local data processing with minimal delay. According to NVIDIA, DRIVE Thor can handle up to 2,000 TOPS at under 400W power consumption.

4. Education – Offline AI Assistants

NVIDIA has partnered with Harvard University on a project to introduce local AI into lecture halls. Students can use translation models, writing assistants, and text analysis tools without internet access. Early tests show this solution reduces infrastructure costs by 80% compared to traditional cloud-based setups.

5. Military Sector – Drone Data Analysis

A collaboration with Lockheed Martin focuses on using Jetson Thor in military drones. AI models analyze sensor data in real time, detect threats, and make decisions without sending data to a central hub. According to NVIDIA, this reduces dependence on satellite connectivity and increases system resilience to cyberattacks.

Which AI Models Are Optimized for Jetson Thor and Blackwell?

NVIDIA not only provides hardware but also software to maximize the potential of Jetson Thor and Blackwell. Here are the most important AI models optimized for these chips:

1. Llama 3 117B (Meta)

Llama 3 117B is one of the largest open language models, optimized by NVIDIA for Jetson Thor. According to NVIDIA tests, the model achieves 50 tokens per second at FP8 precision on Jetson Thor—fast enough for real-time conversations.

2. Mixtral 8x22B (Mistral AI)

Mixtral 8x22B is a 141-billion-parameter model designed for efficiency. Thanks to its Mixture of Experts (MoE) design, only a subset of experts (about 39 billion parameters) is active for each token, so it delivers high performance with lower compute requirements. NVIDIA optimized it for Blackwell B100, achieving 20 tokens per second in FP8.
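
To see why a Mixture of Experts model needs less compute per token than its total parameter count suggests, consider a toy top-k router. This is a simplified sketch in plain Python, not the routing code of any specific model. The gate scores, the choice of 8 experts with 2 active, and the function names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    Only these k experts run for the token; the others stay idle.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


if __name__ == "__main__":
    # Hypothetical gate scores for one token across 8 experts.
    scores = [0.1, 2.3, -1.0, 0.7, 1.9, -0.4, 0.0, 0.5]
    chosen = route_top_k(scores, k=2)
    print("Active experts:", [i for i, _ in chosen])
    print(f"Experts used for this token: {len(chosen)} of {len(scores)}")
```

Because only k experts run per token, compute scales with the active parameters. All experts still have to reside in memory, however, so the memory footprint is set by the total parameter count.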

3. Nemotron-4 15B (NVIDIA)

Nemotron-4 15B is a text-generation model created by NVIDIA and available in NVIDIA AI Foundation Models. It’s optimized for Jetson Thor and achieves 100 tokens per second at INT4 precision.
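
Tokens-per-second figures translate directly into how long a user waits for an answer. Here is a small sketch using the throughput numbers quoted above. The 200-token reply length is an arbitrary assumption, and the calculation ignores prompt processing time.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Average time to generate one token, in milliseconds."""
    return 1000.0 / tokens_per_second


def reply_seconds(tokens_per_second: float, reply_tokens: int = 200) -> float:
    """Time to generate a reply of reply_tokens tokens, in seconds."""
    return reply_tokens / tokens_per_second


if __name__ == "__main__":
    # Throughput figures as quoted in this article (tokens per second).
    quoted = {
        "117B model on Jetson Thor (FP8)": 50,
        "8x22B MoE model on B100 (FP8)": 20,
        "Nemotron-4 15B on Jetson Thor (INT4)": 100,
    }
    for name, tps in quoted.items():
        print(f"{name}: {ms_per_token(tps):.0f} ms/token, "
              f"{reply_seconds(tps):.0f} s for a 200-token reply")
```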

4. Other Models and Frameworks

NVIDIA also supports other models and frameworks compatible with Jetson Thor and Blackwell:

TensorRT-LLM – A framework for optimizing language models to maximize hardware performance.
vLLM – An open-source engine for serving large language models with efficient memory management.
PyTorch 2.7+ – The first PyTorch releases with official support for the Blackwell architecture.
NVIDIA AI Foundation Models – A catalog of AI models optimized by NVIDIA and ready for use on Jetson Thor and Blackwell.

If you want to learn more about optimizing language models, check out our post on GLM-5.2 – The New Leader Among Open AI Models?

Competition: Who Else Is Trying to Catch Up with NVIDIA?

NVIDIA isn’t the only player in the AI chip market. In recent years, AMD, Intel, Google, and Qualcomm have introduced their own solutions. How do they compare to Blackwell and Jetson Thor?

1. AMD Instinct MI325X

Computational power: about 2,600 TOPS (INT8, peak).
Price: ~$15,000.
Pros: Affordable, open software (ROCm), strong general-purpose performance.
Cons: Lower performance in language models (LLMs) compared to Blackwell.

2. Intel Gaudi 3

Computational power: 1,835 TFLOPS (BF16, peak).
Price: ~$10,000.
Pros: Affordable, PyTorch support, strong general-purpose performance.
Cons: Smaller LLM ecosystem, higher power consumption.

3. Google TPU v5e

Computational power: 197 TFLOPS (BF16) per chip.
Price: Cloud-only (Google Cloud).
Pros: High cloud performance, seamless Google ecosystem integration.
Cons: No edge solutions, cloud dependency.

4. Qualcomm Cloud AI 100

Computational power: up to 400 TOPS (INT8).
Price: ~$3,500.
Pros: Low cost, strong performance in mobile devices.
Cons: Limited support for large models (e.g., 120B+).

When comparing these solutions, NVIDIA stands out primarily for its performance in language models (LLMs) and support for local deployment. Competitors focus mainly on price (Intel, Qualcomm) or cloud solutions (Google).

Challenges and Limitations: What Could Hinder the Revolution?

Despite their enormous potential, Blackwell and Jetson Thor face several challenges. Here are the most critical ones:

1. Memory – The Biggest Challenge for 120B Models

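A 120-billion-parameter model needs ~120 GB for its weights alone at FP8 precision (one byte per parameter) and ~240 GB at FP16/BF16, far more than the RAM in a standard laptop. Jetson Thor's 64 GB of LPDDR5X is not enough even at FP8, so in practice such a model must be quantized to INT4 (~60 GB), split across several linked chips, or run on hardware with larger HBM3E memory.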

“Memory is the biggest challenge in running large models locally. Without proper hardware support, the models simply won’t work.”
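
The memory requirement follows from the parameter count and the number of bytes stored per parameter. Here is a minimal sketch of that weights-only estimate. The KV cache, activations, and runtime overhead come on top, so real requirements are higher. The 64 GB default is the Jetson Thor figure quoted in this article, and the helper names are illustrative.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP16/BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weights_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def modules_needed(footprint_gb: float, module_gb: float = 64.0) -> int:
    """Number of memory pools of module_gb needed to hold the weights."""
    return math.ceil(footprint_gb / module_gb)


if __name__ == "__main__":
    params = 120e9  # 120 billion parameters
    for precision in BYTES_PER_PARAM:
        gb = weights_gb(params, precision)
        print(f"{precision}: {gb:.0f} GB of weights, "
              f"needs at least {modules_needed(gb)} x 64 GB")
```

Under these assumptions, only the INT4 variant fits into a single 64 GB pool, and it leaves very little room for anything besides the weights.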

2. Cooling – A Critical Factor for Stable Operation

Jetson Thor consumes ~100W, which may seem modest but requires proper cooling in enclosed spaces (e.g., cars or industrial devices). NVIDIA offers both active and passive cooling systems, though some applications may require liquid cooling.

3. Cost – Not for Every Budget

The developer version of Jetson Thor is expected to cost $5,000–$8,000. While this is far less than server-grade Blackwell, it’s still a significant investment that may deter smaller companies and individual users. NVIDIA offers leasing and subscription options (e.g., NVIDIA AI Enterprise) to reduce costs.
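
Whether the purchase pays off depends on how many hours of cloud time it replaces. Here is a back-of-the-envelope sketch. The hardware prices are the estimates quoted above, while the cloud hourly rates are placeholder assumptions to be replaced with a real quote. Electricity and maintenance are ignored.

```python
def break_even_hours(hardware_price_usd: float,
                     cloud_rate_usd_per_hour: float) -> float:
    """Hours of cloud usage whose cost equals the hardware price."""
    if cloud_rate_usd_per_hour <= 0:
        raise ValueError("cloud_rate_usd_per_hour must be positive")
    return hardware_price_usd / cloud_rate_usd_per_hour


if __name__ == "__main__":
    # Hypothetical cloud rates in USD per hour; not taken from any price list.
    assumed_rates = (10.0, 30.0, 100.0)
    # Estimated developer kit price range quoted in the article.
    for price in (5000.0, 8000.0):
        for rate in assumed_rates:
            hours = break_even_hours(price, rate)
            print(f"${price:,.0f} kit vs ${rate:.0f}/h cloud: "
                  f"break-even after {hours:.0f} hours")
```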

4. Software – The Need for Optimization

While tools like NVIDIA's TensorRT-LLM and the open-source vLLM exist, running a 120B model on Jetson Thor still requires optimization. This involves knowledge of quantization (e.g., FP8, INT4), pruning, and distillation. Not every user will be able to optimize models independently.

“Model optimization is an art. Without the right expertise, even the best hardware won’t deliver.”
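
To give a feel for what quantization does, here is a toy sketch of symmetric per-tensor INT4 quantization in plain Python. It only illustrates the idea. Production toolchains typically use per-channel or per-group scales and calibration data, and the weight values below are made up.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.42, -1.30, 0.07, 0.88, -0.51]  # made-up example values
    codes, scale = quantize_int4(weights)
    restored = dequantize(codes, scale)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print("INT4 codes:", codes)
    print(f"Scale: {scale:.4f}, max round-trip error: {max_err:.4f}")
```

Each weight now takes 4 bits instead of 16, at the price of a rounding error of at most half the scale per weight. Keeping that error from degrading model quality is where the expertise comes in.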

5. Availability – Priority for Enterprise Customers

Initially, Jetson Thor will be available mainly to business customers and institutions. Mass production and availability for individual users are planned for early 2025. This means the solution isn’t yet accessible to a broader audience.

When and Where Will It Be Available? NVIDIA’s Pricing and Plans

NVIDIA’s timelines and availability for Blackwell and Jetson Thor are closely tied to market demand. Here are the key details:

1. Blackwell B100

Release date: Q3 2024 (for OEM customers and partners).
Availability: Primarily for server manufacturers (e.g., Dell, HPE).
Price: Estimated at $12,000–$15,000 per chip.

2. Jetson Thor

Release date:
- Developer kit (Jetson Thor Developer Kit): Q4 2024.
- Mass production: early 2025.
Availability:
- Direct sales via NVIDIA Store.
- Distributors (e.g., Arrow, Avnet).
Price: Estimated at $5,000–$8,000 for the developer kit.
Long-term licenses: Available for business customers (e.g., NVIDIA AI Enterprise).

3. NVIDIA DRIVE Thor

Release date: H1 2025.
Availability: Mass production for automotive manufacturers.
Price: Estimated at $3,000–$6,000 (long-term licenses).

To stay updated on release dates and availability, follow NVIDIA’s official channels, such as the NVIDIA Jetson Thor Developer Portal.

Is Investing in Jetson Thor and Blackwell Worth It? Summary

NVIDIA Jetson Thor and Blackwell B100 are undeniably groundbreaking solutions for local AI. They enable running 120-billion-parameter models directly on devices, opening doors to entirely new applications. Here are the key takeaways:

Pros

Performance: Blackwell B100 delivers up to 20 PFLOPS in FP4, while Jetson Thor handles models of up to 120B parameters after optimization.
Efficiency: Local deployment saves energy and costs compared to cloud solutions.
Privacy: No need to transmit data to external servers.
Speed: Response times under 10 ms—critical for many applications.
Scalability: Multiple chips can be linked via NVLink 5.0.

Cons and Challenges

Cost: Jetson Thor is a several-thousand-dollar investment.
Memory: 120B models require large memory capacities, which may be an issue in some applications.
Software: Model optimization is necessary, requiring specialized knowledge.
Availability: Currently, chips are available mainly to business customers.

Who Is This Solution For?

Jetson Thor and Blackwell B100 are ideal for:

Companies in healthcare, industry, automotive, and defense.
Educational institutions looking to introduce local AI into lecture halls.
AI enthusiasts experimenting with large models without cloud dependency.
Developers needing high-performance hardware for testing and optimizing models.

If you’re curious about how local AI works in practice, check out our post on AI in Polish Business – Where Local Companies See Opportunities and Where They Hit a Wall, where we analyze how Polish companies are adopting local AI.

The Future of Local AI: What’s Next?

NVIDIA Jetson Thor and Blackwell B100 are just the beginning of the local AI revolution. In the coming years, we can expect even more powerful and energy-efficient chips. Here are some trends that may shape the future:

1. Miniaturization of Chips

NVIDIA is already working on the next generation of chips that will be even smaller and more energy-efficient. Rumors suggest the next Jetson iteration could consume under 50W while delivering similar performance.

2. Better Integration with Devices

Future AI chips will be more tightly integrated with components like sensors, cameras, and modems, enabling even faster and more efficient data processing.

3. Support for More Models

NVIDIA is continuously optimizing more AI models for Jetson Thor and Blackwell. In the coming years, we may see support for models up to 300 billion parameters.

4. Competition from Other Players

While NVIDIA dominates the AI chip market, competition from AMD, Intel, and Qualcomm could bring innovative solutions. This could lead to lower prices and better options for users.

To stay updated on the latest AI trends, check out our post on Google I/O 2025 – AI at the Core of Google’s Strategy, where we analyze how tech giants are shaping the future of AI.