What Is AI?

Artificial intelligence is a broad label for software that performs tasks we associate with human cognition — perceiving, reasoning, deciding, generating language. It is not a single technology. It’s a goal, pursued with many techniques across seven decades.

For an engineer, the useful definition is narrower:

AI today is software that learns its behavior from data instead of having that behavior written explicitly by a programmer.

A spam filter built from hand-written if/else rules is automation. A spam filter that learned the patterns of spam from a million labeled emails is AI. The difference is where the logic comes from: a human author, or an optimization process fitting a model to examples.
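To make the contrast concrete, here is a minimal sketch of both approaches in plain Python. Everything in it is illustrative: the two keyword rules, the six toy emails, and the naive Bayes-style word scoring are assumptions chosen for brevity, not a production design. A real filter would train on far more data and use an established library.

```python
import math
from collections import Counter

# Approach 1: automation. A human wrote the logic.
def rule_based_is_spam(text: str) -> bool:
    lowered = text.lower()
    return "free money" in lowered or "click here" in lowered

# Approach 2: learning. The logic (per-word weights) is fitted from labeled examples.
def train(examples):
    """Count word frequencies per class and return a classifier function."""
    counts = {True: Counter(), False: Counter()}
    for text, is_spam in examples:
        counts[is_spam].update(text.lower().split())
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}

    def log_odds(word: str) -> float:
        # Laplace-smoothed log-likelihood ratio: positive means "more spam-like".
        p_spam = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_ham = (counts[False][word] + 1) / (totals[False] + len(vocab))
        return math.log(p_spam / p_ham)

    def learned_is_spam(text: str) -> bool:
        # Words never seen in training are ignored; class priors are omitted
        # because the toy data set is balanced.
        score = sum(log_odds(w) for w in text.lower().split() if w in vocab)
        return score > 0

    return learned_is_spam

# Hypothetical toy data standing in for "a million labeled emails".
TOY_EMAILS = [
    ("win a free prize now", True),
    ("claim your prize today", True),
    ("cheap pills free shipping", True),
    ("meeting moved to monday", False),
    ("lunch today at noon", False),
    ("please review the monday report", False),
]

learned_is_spam = train(TOY_EMAILS)

print(rule_based_is_spam("claim your free prize"))  # rules never anticipated this wording
print(learned_is_spam("claim your free prize"))     # word weights came from the examples
```

Nobody wrote a rule about the word "prize". The learned filter weights it toward spam only because the labeled examples did, so changing the training data changes the behavior without touching the code.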

Narrow AI vs. general AI

Almost everything in production today is narrow AI: systems that are strong, sometimes superhuman, at one task and useless outside it. AlphaGo cannot drive a car. A fraud model cannot summarize a contract. Even a large language model, which feels general, is a narrow system in this sense: text prediction is an extremely broad capability, but it is still one trained capability.

Artificial general intelligence (AGI), a system that matches human flexibility across arbitrary tasks, does not exist. When you read marketing copy, mentally replace “AI” with “a narrow model trained for this specific thing.” That substitution will calibrate your expectations correctly almost every time.

Term	What it means	Status
Narrow AI (ANI)	Strong, sometimes superhuman, at one specific task	Everywhere in production
General AI (AGI)	Human-level across arbitrary tasks	Research goal; does not exist
Superintelligence (ASI)	Beyond human across the board	Speculative

A 60-second history

Understanding how the field moved explains why today’s tools look the way they do.

Symbolic AI (1950s–1980s). Intelligence as hand-coded rules and logic — “expert systems.” Powerful in narrow domains, but brittle: humans had to enumerate every rule. It could not scale to messy, real-world inputs.
Statistical machine learning (1990s–2000s). Instead of writing rules, fit them from data. Decision trees, SVMs, and logistic regression powered search ranking, spam filtering, and recommendations. Still required humans to hand-design the input “features.”
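Deep learning (2012–2017). Neural networks with many layers learned the features themselves, directly from raw pixels, audio, or text. In 2012 a deep convolutional network (AlexNet) won the ImageNet image-classification challenge by a wide margin, and that result kicked off the modern era.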
The transformer & foundation models (2017–today). The 2017 transformer architecture made it possible to train enormous models on internet-scale text. These foundation models are trained once at great cost, then adapted to thousands of downstream tasks. LLMs are the headline example.

The throughline: each era moved more of the work from human authoring to automatic learning — first the rules, then the features, then the tasks themselves.

Why AI works now

The core ideas behind neural networks are decades old. Three things changed:

Data. The internet produced text, image, and code corpora large enough to train models with billions of parameters.
Compute. GPUs — built for graphics — turned out to be ideal for the matrix math of neural networks, and got radically cheaper per operation.
Algorithms. The transformer architecture scaled gracefully: more parameters, data, and compute produced predictably better models, an empirical pattern known as “scaling laws.”
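The “scaling laws” mentioned above are usually described as power laws: loss falls roughly as a power of compute, data, or parameter count. The sketch below shows what that shape implies. The function name and the constants a and b are made up for illustration and are not fitted to any real model. Published fits also tend to include a floor that the loss cannot go below, which is omitted here.

```python
def predicted_loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Power-law form: loss = a * compute ** (-b). Constants are illustrative only."""
    return a * compute ** (-b)

# Sweep compute across several orders of magnitude.
for exponent in (18, 20, 22, 24):
    compute = 10.0 ** exponent
    print(f"compute=1e{exponent}  predicted loss={predicted_loss(compute):.3f}")

# Under a power law, every 10x increase in compute multiplies the loss by the
# same factor, 10 ** (-b), no matter where on the curve you start.
ratio = predicted_loss(1e21) / predicted_loss(1e20)
print(f"loss ratio per 10x compute: {ratio:.4f}")
```

The practical consequence is predictability: a trend measured on small training runs can be extrapolated to estimate what a larger run would achieve before paying for it.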

What this means for you

As an AI engineer you will almost never train a foundation model. You will consume one — via an API or an open-weights checkpoint — and your job is everything around it: choosing the right model, feeding it the right context, constraining its outputs, evaluating quality, and controlling cost and latency.
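As a rough sketch of what “everything around it” can look like, the code below wraps a model call with context injection, output validation, latency measurement, and a tiny evaluation loop. The function fake_model is a hypothetical stand-in that returns a canned reply. The ticket-classification task, the label set, and all function names are likewise assumptions for illustration, not any vendor's API.

```python
import json
import time
from typing import Callable

ALLOWED_LABELS = {"billing", "technical", "other"}

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; always returns a canned reply."""
    return '{"label": "billing"}'

def classify_ticket(ticket: str, context: str, model: Callable[[str], str]) -> dict:
    # Feed the right context: reference material goes into the prompt.
    prompt = (
        "Classify the support ticket as one of: billing, technical, other.\n"
        'Reply with JSON only, for example {"label": "other"}.\n\n'
        f"Context:\n{context}\n\nTicket:\n{ticket}"
    )

    # Control latency: measure every call so regressions are visible.
    start = time.perf_counter()
    raw = model(prompt)
    latency_s = time.perf_counter() - start

    # Constrain the output: accept only valid JSON carrying an allowed label.
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    label = parsed.get("label") if isinstance(parsed, dict) else None
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        label = "other"  # fall back instead of passing bad output downstream

    return {"label": label, "latency_s": latency_s}

def accuracy(examples: list, model: Callable[[str], str]) -> float:
    """Evaluate quality: fraction of labeled examples classified correctly."""
    hits = sum(
        classify_ticket(ticket, "", model)["label"] == expected
        for ticket, expected in examples
    )
    return hits / len(examples)

result = classify_ticket("I was charged twice", "Refund policy: 30 days", fake_model)
print(result["label"])
```

Passing the model in as a plain callable keeps the surrounding logic testable without network access, and swapping in a different model later means changing one argument.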

That is a software engineering job, and it’s what the rest of this guide covers.

Key takeaways

AI is software whose behavior is learned from data.
Everything in production is narrow AI; AGI does not exist.
The field progressed by automating more of the work: rules, then features, then tasks.
Modern AI works because data, compute, and the transformer architecture arrived together.
Your role is to build systems around foundation models, not to build the models.