Becoming an AI Developer Without the Math PhD: A Practical Journey into LLMs, Agents, and Real-World Tools
For the past year, the world has been obsessed with what artificial intelligence can do for us. From ChatGPT writing emails to MidJourney generating fantastical images, the dominant narrative has been "how to use AI." But what if you're not satisfied just prompting models? What if you want to build them, customize them, run them offline, and deploy them securely in the cloud?
This is the journey I'm starting now: learning to build with AI, not just use it. And in this post, I’ll lay out the core principles, motivations, and roadmap that will guide my exploration into becoming an AI developer—with a specific focus on LLMs (Large Language Models), agents, training workflows, and cloud/offline deployment.
Let me be clear: I’m not here to write a research paper, derive equations, or become a machine learning theorist. I don’t need to build a transformer from scratch in NumPy. My goal is pragmatic:
I want to learn how to train, run, integrate, and deploy powerful AI tools in real-world environments.
Let’s break this down into the pillars of this journey.
1. Becoming an AI Developer: The New Craft
Today, being an AI developer means being part software engineer, part ML engineer, part systems architect, and part product thinker. The good news is, you don’t need a PhD. You need curiosity, hands-on time, and a practical mindset.
My focus isn’t on AI theory. It’s on:
- Training or fine-tuning models (LLMs, vision models, etc.)
- Running models locally and in the cloud
- Implementing agents (think: multi-step reasoning systems or API-integrated workflows)
- Integrating LLMs with tools like search engines, file systems, knowledge bases
- Keeping everything secure, fast, and understandable
This is not about using AI to write a blog post. It’s about building the system that understands your files, fetches your emails, or answers your customer questions—and knowing exactly how it works.
2. Training Models: Fine-Tuning and Beyond
One of my early goals is learning how to take an open-source model (like Meta’s LLaMA or Mistral) and customize it. I’m not aiming for full-scale training on terabytes of data—but rather:
- Fine-tuning a model on domain-specific content
- Learning how to do parameter-efficient tuning (like LoRA, QLoRA)
- Using datasets I care about (technical documents, customer support logs, etc.)
I plan to start by running these fine-tuning jobs offline or on AWS EC2 with GPUs, using tools like HuggingFace's `transformers`, `peft`, and `trl`. I'll try small models first (e.g., 7B or 3B parameter models), and work my way up to more complex tuning pipelines.
Why? Because having a model that understands your language, your products, and your workflows is the difference between a toy and a tool.
3. Implementing Agents: Orchestrating Reasoning + Tools
The next area I want to explore is agents. Not just chatbots, but smart, tool-using, context-aware systems.
For example:
- A file assistant that can read and answer questions about local Markdown, PDF, or code files
- A developer agent that can call APIs, search the web, and use Bash to automate tasks
- A customer support AI that integrates with ticket systems and logs
These are powered by LLMs plus memory and tool use, often combined with Retrieval-Augmented Generation (RAG) to ground answers in your own data.
As I explore these, I want to understand what agents can actually do, where the hallucination and reliability issues are, and how to make them robust in production.
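As a starting point, here's a minimal sketch of the retrieval half of a file assistant: embed local Markdown files, pick the chunks closest to a question, and stuff them into a prompt. The embedding model name and the `notes/` folder are assumptions for illustration.

```python
# Minimal sketch of RAG retrieval: embed local docs, rank by similarity, build a prompt.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Load and embed local markdown files (one chunk per file, for simplicity).
docs = {p.name: p.read_text() for p in Path("notes").glob("*.md")}
doc_vectors = embedder.encode(list(docs.values()), normalize_embeddings=True)

def retrieve(question: str, k: int = 3) -> str:
    """Return the k most relevant chunks, joined as context for the LLM."""
    q_vec = embedder.encode([question], normalize_embeddings=True)
    scores = doc_vectors @ q_vec.T            # cosine similarity on normalized vectors
    top = np.argsort(scores.ravel())[::-1][:k]
    names = list(docs.keys())
    return "\n\n".join(docs[names[i]] for i in top)

context = retrieve("How do I deploy the staging environment?")
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
```

The generation half just hands `prompt` to whichever model is running locally or in the cloud; the point is that retrieval, not the model, decides what context the answer is grounded in.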
4. Running Models Offline
This one is personal: I want full local control. I don’t want every prompt or file to be uploaded to OpenAI or Anthropic.
So, I plan to run:
- Quantized LLMs using `llama.cpp`, `text-generation-webui`, or `koboldcpp` (a minimal sketch appears below)
- Vision models and Stable Diffusion locally using ComfyUI
- Agents that talk to local tools and use local embeddings
This lets me:
- Experiment without latency or cost limits
- Keep things private and air-gapped
- Build a stack that could run on edge devices or in secure environments
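Here's the quantized-LLM sketch mentioned above, using the `llama-cpp-python` bindings around `llama.cpp`. The GGUF file path is a placeholder for whatever quantized model I end up downloading.

```python
# Minimal sketch: running a quantized GGUF model fully offline via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # assumption: a local GGUF file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to a GPU if available, otherwise run on CPU
)

output = llm(
    "Summarize the trade-offs of running LLMs locally.",
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```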
I already have a compact AMD-based mini PC. But for heavy lifting, I'll rent EC2 GPU instances (e.g., `g5.xlarge`) to train and test models with full CUDA support.
5. Learning to Use the Cloud: AWS Bedrock and Beyond
While I love offline setups, I also want to master cloud-native AI development:
- Using Amazon Bedrock to call hosted models like Claude or Titan (see the sketch right after this list)
- Deploying my own models with SageMaker JumpStart, `ml.m5` instances, or even ECS with NVIDIA GPU support
- Building LLM-powered APIs with Lambda + API Gateway + DynamoDB (a minimal handler sketch closes this section)
- Learning MLOps workflows: model tracking, deployment, versioning
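Here's the Bedrock sketch promised above: a minimal call to a hosted model through `boto3`. The region and model ID are assumptions; you'd use whichever model your account has been granted access to.

```python
# Minimal sketch: calling a hosted model through Amazon Bedrock with boto3.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumption: any enabled model ID works
    messages=[{"role": "user", "content": [{"text": "Explain RAG in two sentences."}]}],
    inferenceConfig={"maxTokens": 256, "temperature": 0.5},
)

print(response["output"]["message"]["content"][0]["text"])
```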
The cloud is where scale, availability, and security live. I want to understand:
- Which models to use when
- Cost vs. performance tradeoffs
- How to handle burst loads and real users
This means exploring Bedrock, SageMaker, EC2 Spot training jobs, and maybe even multi-region deployment patterns.
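And to sketch the Lambda + API Gateway idea from the list above: a minimal handler that forwards a question to Bedrock and returns the answer as JSON. The model ID and request fields are illustrative assumptions, not a finished design.

```python
# Minimal sketch of a Lambda handler behind API Gateway that proxies to Bedrock.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def lambda_handler(event, context):
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "")

    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumption: an enabled model
        messages=[{"role": "user", "content": [{"text": question}]}],
        inferenceConfig={"maxTokens": 256},
    )
    answer = response["output"]["message"]["content"][0]["text"]

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": answer}),
    }
```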
6. Comparing AWS EC2 AI-Capable Instances
Before investing in expensive local GPUs, I explored what AWS EC2 has to offer. Its GPU instances give you access to powerful NVIDIA hardware with a full CUDA stack, letting you experiment with real-world training, inference, and deployment scenarios.
Here's a snapshot comparison of EC2 instances ideal for AI workloads:
| Instance Type | GPU Model | VRAM | Price (On-Demand) | Price (Spot) | Notes |
|---|---|---|---|---|---|
| `g5.xlarge` | NVIDIA A10G | 24 GB | ~$1.00/hr | ~$0.35/hr | Best balance of power, cost, and CUDA support for SDXL and LLMs |
| `g4dn.xlarge` | NVIDIA T4 | 16 GB | ~$0.60/hr | ~$0.20/hr | Lower-end option, fine for SD 1.5 or small LLMs |
| `g6.xlarge` | NVIDIA L4 | 24 GB | ~$1.10/hr | ~$0.40/hr | CUDA 12.2 support, fast and efficient |
| `p3.2xlarge` | NVIDIA V100 | 16 GB | ~$3.00/hr | ~$1.00/hr | Older, more expensive, still fast |
| `p4d.24xlarge` | NVIDIA A100 x8 | 320 GB | ~$32.00/hr | ~$9.00/hr | Extremely powerful, best for multi-GPU training |
All of these instances let you:
- Run full SDXL pipelines, even with ControlNet or LoRA tuning
- Fine-tune LLMs with `transformers`, `peft`, or `trl`
- Benchmark inference latency vs. cost
- Build production-like setups with Docker, FastAPI, or LangChain
Pro Tips:
- Use Spot Instances when possible to save up to 70% (see the launch sketch at the end of this section).
- Store models in S3 and mount them with EFS, or copy them to EBS for reuse.
- Use Deep Learning AMIs or containers preinstalled with PyTorch + CUDA.
- For ephemeral jobs, add auto-termination scripts to avoid cost surprises.
This approach gives you true NVIDIA GPU experience, and it’s perfect for trial runs before committing to building your own $2,000+ local rig.
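Here's the launch sketch referenced in the tips above: requesting a `g5.xlarge` as a Spot instance with `boto3` and tagging it so a cleanup script can find and terminate it later. The AMI ID and key pair name are placeholders.

```python
# Minimal sketch: launching a g5.xlarge Spot instance with boto3 and tagging it for cleanup.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # assumption: a Deep Learning AMI in your region
    InstanceType="g5.xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",              # assumption: an existing EC2 key pair
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "auto-terminate", "Value": "true"}],
    }],
)
print(response["Instances"][0]["InstanceId"])
```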
7. Skipping the Math (Mostly)
I’ve said it before: I don’t need to know how to implement backpropagation from scratch. I care about model behavior, training workflows, and integration patterns, not the underlying calculus.
That said, I’ll still learn the basics of:
- Tokenization (and what it means for prompts; see the small sketch below)
- Attention mechanisms (at a conceptual level)
- Embeddings and vector search
- Fine-tuning configs and hyperparameters
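Here's the small tokenization sketch mentioned above, using a HuggingFace tokenizer to show what a prompt turns into before the model ever sees it (the `gpt2` tokenizer is just a convenient example):

```python
# Minimal sketch: what tokenization means for prompts.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "Fine-tune a 7B model on custom markdown docs."
tokens = tokenizer.tokenize(prompt)
ids = tokenizer.encode(prompt)

print(tokens)      # the subword pieces the model actually sees
print(len(ids))    # token count, which is what context windows and pricing measure
```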
But if the choice is between reading a 60-page paper on gradient descent vs. getting a working RAG agent running in AWS… I’m picking the agent.
8. What’s Next
In the coming weeks and months, I’ll be sharing hands-on walkthroughs of what I’m building and learning:
- Building your first local RAG agent with `llama.cpp`
- Fine-tuning a 7B model on custom Markdown docs
- Deploying an LLM-powered customer support API in AWS Lambda
- Using Amazon Bedrock with multiple foundation models in one app
I’ll post every success, failure, and bottleneck—because I know I’m not the only software engineer tired of prompt engineering and ready to get into real AI building.
If you're on the same journey—to build with AI, not just use it—follow along. This is going to be a practical, grounded, and developer-first look at how to make LLMs and agents actually work for you.