The word "algorithm" appears everywhere: news articles blame algorithms for polarization, companies advertise algorithmic recommendations, and regulators debate algorithmic accountability. Yet most explanations of what an algorithm actually is either stay too abstract to be useful or dive immediately into code that loses most readers.
This article takes a different approach. It explains algorithms in plain language — what they are, the main types, how they scale, and how the algorithms you interact with daily actually work — without requiring any programming background.
The word "algorithm" itself derives from the name of the 9th-century Persian mathematician Muhammad ibn Musa al-Khwarizmi, whose treatise Al-Kitab al-mukhtasar fi hisab al-jabr wa-l-muqabala (from which we also get "algebra") introduced systematic methods for solving linear and quadratic equations. Al-Khwarizmi's step-by-step solution procedures were so influential that the Latinized form of his name — algorismus — came to describe any systematic calculation procedure. The modern computational meaning developed in the 20th century as computer science emerged as a discipline.
What Is an Algorithm?
An algorithm is a finite, step-by-step set of instructions for solving a problem or accomplishing a task. Every algorithm has three components:
- Input: data or information the algorithm starts with
- Process: a defined sequence of steps applied to the input
- Output: the result after the process completes
Algorithms are not unique to computers. Consider a recipe: it takes ingredients (input), specifies a sequence of preparation steps (process), and produces a dish (output). Driving directions are an algorithm. A medical triage checklist is an algorithm. A tax form's calculation instructions are an algorithm.
What makes computer algorithms distinctive is that the instructions must be precise enough for a machine to execute without ambiguity. A recipe that says "add salt to taste" is usable by a human who can taste. A computer program cannot use that instruction — it needs "add 1.5 grams of salt." Computers require complete, unambiguous specification.
Algorithms must also be finite — they must terminate after a definite number of steps, producing an output. An algorithm that runs forever without producing a result has failed, even if each individual step is correct. This seems obvious but is non-trivial in practice: the Halting Problem, proven by Alan Turing in 1936, shows that there is no general algorithm that can determine, for all possible programs and inputs, whether a given program will eventually halt or run forever. This was one of the foundational theoretical results in computer science.
"An algorithm does not think. It executes. The intelligence — or the bias — in an algorithm comes from the humans who designed it, chose what to optimize for, and decided what data to train it on."
Why Algorithms Matter Beyond Computing
Understanding algorithms has become a practical necessity for non-programmers because algorithms now make or influence decisions that directly affect your life:
- What content you see on social media
- Whether your loan application is approved
- Which candidates a company's hiring tool flags for human review
- How your insurance premium is calculated
- Which news stories appear at the top of your search results
- What products are recommended to you and at what price
The scale of algorithmic decision-making is staggering. Facebook's news feed algorithm makes ranking decisions affecting over 3 billion users daily (Meta, 2023). Automated hiring screening tools process millions of job applications annually; a 2021 Harvard Business School report estimated that 88% of employers use automated resume screening before any human reviews an application (Fuller et al., 2021). Credit scoring algorithms influence access to mortgages, car loans, and rental housing for essentially every adult in developed economies.
Algorithmic systems embedded in these decisions are not neutral: they encode choices about what to prioritize, what data to use, and what tradeoffs to make. Understanding how algorithms work in general terms helps you ask better questions about the specific ones that affect you.
Types of Algorithms: The Main Categories
Sorting Algorithms
A sorting algorithm rearranges a collection of items into a defined order. Sorting is one of the most fundamental operations in computing because so many other tasks depend on data being sorted first.
Bubble sort is the simplest: compare pairs of adjacent items and swap them if they are in the wrong order. Repeat until no more swaps are needed. It is easy to understand but extremely slow for large datasets.
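The two-step loop described above can be sketched in a few lines of Python (the function name and sample list are illustrative, not from any library):

```python
def bubble_sort(items):
    """Repeatedly swap adjacent out-of-order pairs until a full pass needs no swaps."""
    items = list(items)  # work on a copy; leave the caller's list untouched
    n = len(items)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):
            if items[i] > items[i + 1]:
                # adjacent pair is in the wrong order: swap it
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

Each pass "bubbles" the largest remaining value toward the end, which is where the algorithm gets its name.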
Merge sort uses a "divide and conquer" strategy: split the list in half, sort each half recursively, then merge the two sorted halves. It is far more efficient than bubble sort.
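A minimal sketch of the divide-and-conquer idea, assuming nothing beyond the description above (the merge step walks both sorted halves and repeatedly takes the smaller front element):

```python
def merge_sort(items):
    """Split, recursively sort each half, then merge the two sorted halves."""
    if len(items) <= 1:
        return list(items)  # a list of 0 or 1 items is already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # merge: repeatedly take the smaller of the two front elements
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one of these two is empty;
    merged.extend(right[j:])  # the other holds the leftovers, already sorted
    return merged
```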
Quicksort selects a "pivot" element, partitions all other elements into those smaller and those larger than the pivot, and recursively sorts both partitions. Quicksort is typically the fastest general-purpose sorting algorithm in practice and is used in many programming language standard libraries.
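The pivot-and-partition idea can be sketched as follows. This is the simple, readable version; production implementations (such as those in standard libraries) partition in place and choose pivots more carefully, but the logic is the same:

```python
def quicksort(items):
    """Pick a pivot, partition around it, recursively sort the partitions."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```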
Timsort, developed by Tim Peters in 2002, is a hybrid sorting algorithm derived from merge sort and insertion sort. It was designed to perform well on real-world data, which often contains ordered runs of elements. Timsort is now the default sorting algorithm in Python, Java, and Android — one of those invisible choices that affects billions of executions daily.
Search Algorithms
A search algorithm finds a specific item in a collection.
Linear search is the most basic: check items one by one until you find the target or exhaust the list. It works on any data, sorted or not, but is slow for large collections.
Binary search is dramatically faster — but only works on sorted data. It works by repeatedly halving the search space:
- Look at the middle element of the list
- If it matches the target, you are done
- If the target is smaller, eliminate the upper half and repeat with the lower half
- If the target is larger, eliminate the lower half and repeat with the upper half
Binary search finds any item in a sorted list of one million items in at most 20 comparisons. Linear search could take up to one million. The difference is the algorithm — not the hardware.
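The halving procedure above can be sketched directly, with a counter added to verify the "at most 20 comparisons" claim (the function name and counter are illustrative):

```python
def binary_search(sorted_items, target):
    """Return (index, comparisons) if found, or (-1, comparisons) if not."""
    lo, hi = 0, len(sorted_items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_items[mid] == target:
            return mid, comparisons
        elif target < sorted_items[mid]:
            hi = mid - 1  # eliminate the upper half
        else:
            lo = mid + 1  # eliminate the lower half
    return -1, comparisons

index, comparisons = binary_search(list(range(1_000_000)), 314_159)
print(index, comparisons)  # found at 314159, in no more than 20 comparisons
```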
Hash tables represent a different approach to search: instead of searching through items, they compute a mathematical function (a hash) that maps a key directly to the location where the corresponding value is stored. A well-implemented hash table achieves O(1) lookup — the same time regardless of whether the table contains 100 items or 100 million. Hash tables underlie databases, caches, dictionaries in programming languages, and countless other systems that need fast lookup.
Graph Algorithms
Many real problems involve networks: maps, social connections, computer networks, supply chains. Graph algorithms navigate these structures.
Dijkstra's algorithm (1959), developed by Dutch computer scientist Edsger W. Dijkstra, finds the shortest path between two points in a network — the core technology behind every GPS navigation system. It systematically explores paths from a starting point, always extending the shortest known path first, until the destination is reached. Modern navigation systems use variants of Dijkstra's algorithm combined with heuristics (A* search) to handle road networks with millions of nodes efficiently.
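A compact sketch of Dijkstra's algorithm using a priority queue, with a toy road network (the node names and distances are invented for illustration; real systems operate on graphs with millions of nodes):

```python
import heapq

def dijkstra(graph, start, goal):
    """graph: {node: [(neighbor, distance), ...]}. Returns shortest distance."""
    best = {start: 0}
    queue = [(0, start)]          # priority queue ordered by distance so far
    done = set()
    while queue:
        dist, node = heapq.heappop(queue)  # always extend the shortest known path
        if node in done:
            continue
        done.add(node)
        if node == goal:
            return dist
        for neighbor, weight in graph.get(node, []):
            new_dist = dist + weight
            if new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor))
    return float("inf")  # goal unreachable

roads = {
    "A": [("B", 4), ("C", 2)],
    "C": [("B", 1), ("D", 7)],
    "B": [("D", 1)],
}
print(dijkstra(roads, "A", "D"))  # 4, via A -> C -> B -> D (2 + 1 + 1)
```

Note that the direct-looking route A to B (distance 4) loses to the detour through C (distance 3), which is exactly the kind of non-obvious result the algorithm exists to find.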
PageRank, developed by Larry Page and Sergey Brin at Stanford, was the foundational algorithm for Google search. It treats the web as a graph where websites are nodes and hyperlinks are edges. A page's rank is determined by how many other pages link to it, weighted by the rank of those linking pages. Pages linked by many high-rank pages get high ranks themselves. The idea was that the collective judgment of the web — expressed through linking behavior — was a better measure of quality than the content of the page alone.
Brin and Page published the initial PageRank paper in 1998 (Brin & Page, 1998), describing the algorithm that formed the basis of Google. They noted that the simple link-counting approach produced dramatically better results than the keyword-density approaches dominant at the time — because links represent deliberate editorial choices, not just textual proximity.
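The core of PageRank can be sketched as an iterative calculation over a tiny made-up web of three pages; this is the textbook power-iteration form, not Google's production system:

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns an estimated rank per page."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with rank spread evenly
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, targets in links.items():
            # each page passes its rank, split evenly, to the pages it links to
            share = rank[page] / len(targets)
            for target in targets:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
print(ranks)  # C ranks highest: it is linked by both A and B
```

C collects links from two pages while B collects only half of A's vote, so C ends up with the highest rank: the "weighted by the rank of the linking pages" idea in miniature.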
Encryption Algorithms
Encryption algorithms transform readable data (plaintext) into unreadable data (ciphertext) in a way that only authorized parties with the correct key can reverse. Every secure communication — HTTPS, encrypted messaging, digital payments — relies on encryption algorithms.
RSA encryption (named for Rivest, Shamir, and Adleman, 1977) exploits the mathematical asymmetry between multiplying two large prime numbers (easy) and factoring their product back into the original primes (extremely hard). This asymmetry enables public key cryptography: anyone can encrypt a message using a publicly shared key, but only the holder of the corresponding private key can decrypt it.
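The asymmetry can be demonstrated with a deliberately tiny toy example (real RSA uses primes hundreds of digits long; these values are the standard textbook ones and offer no security whatsoever):

```python
# toy RSA: two small primes, easy to multiply, and here trivially easy to
# factor back; with 300-digit primes the factoring step becomes infeasible
p, q = 61, 53
n = p * q                      # 3233: the public modulus
phi = (p - 1) * (q - 1)        # 3120: used to derive the private key
e = 17                         # public exponent, chosen coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e

message = 65
ciphertext = pow(message, e, n)   # anyone can encrypt with the public (e, n)
decrypted = pow(ciphertext, d, n) # only the holder of d can decrypt
print(decrypted)  # 65: the original message recovered
```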
AES (Advanced Encryption Standard), adopted by the US government in 2001 after an open international competition, is the symmetric encryption algorithm used for the bulk of encrypted data worldwide. Unlike RSA, AES is a symmetric algorithm — the same key encrypts and decrypts — making it far faster for large data volumes.
Machine Learning Algorithms
Machine learning algorithms learn patterns from data rather than following explicitly programmed rules. This distinction is important.
A traditional algorithm for spam filtering might use explicit rules: "If the email contains the words 'free money,' mark as spam." This works until spammers learn the rules and change their language.
A machine learning algorithm for spam filtering trains on millions of labeled examples (spam vs. not spam), identifies statistical patterns that distinguish them, and builds a model that applies those patterns to new emails. It can detect spam patterns that no human explicitly wrote rules for.
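One classic statistical approach to this is a naive Bayes classifier; the article does not name a specific algorithm, so the following is an illustrative sketch with a toy six-message training set standing in for millions of labeled examples:

```python
from collections import Counter
import math

# tiny labeled "training data" (invented for illustration)
spam = ["free money now", "win free prize", "claim your money"]
ham = ["meeting at noon", "project update attached", "lunch tomorrow"]

spam_counts = Counter(w for msg in spam for w in msg.split())
ham_counts = Counter(w for msg in ham for w in msg.split())
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(words, counts, total):
    # add-one (Laplace) smoothing so an unseen word doesn't zero everything out
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in words)

def classify(message):
    words = message.split()
    spam_score = math.log(0.5) + log_likelihood(words, spam_counts, sum(spam_counts.values()))
    ham_score = math.log(0.5) + log_likelihood(words, ham_counts, sum(ham_counts.values()))
    return "spam" if spam_score > ham_score else "ham"

print(classify("free prize money"))   # spam
print(classify("meeting tomorrow"))   # ham
```

No rule here says "flag the word free": the statistical association was learned from the labeled examples, which is the distinction the paragraph above describes.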
Major machine learning algorithm families include:
| Category | What It Does | Example Applications |
|---|---|---|
| Supervised learning | Learns from labeled training data to predict outputs for new inputs | Email spam detection, image recognition, credit scoring |
| Unsupervised learning | Finds patterns in unlabeled data without predefined categories | Customer segmentation, anomaly detection, topic modeling |
| Reinforcement learning | Learns by trial and error, receiving rewards for good actions | Game-playing AI (AlphaGo), robotics, recommendation tuning |
| Deep learning (neural networks) | Learns hierarchical representations of data through layered processing | Natural language processing, image generation, voice recognition |
| Large language models (LLMs) | Predicts likely next tokens from massive text training data | AI assistants, code generation, document summarization |
The last category — large language models — represents the most recent significant shift in what algorithms can do. Models like GPT-4 and its successors are trained on hundreds of billions of words of text and learn statistical patterns so rich that they can write coherently, answer questions, translate languages, and reason through multi-step problems — despite being, at their core, algorithms that predict which word (technically, which token) should come next.
Big O Notation: How Algorithms Scale
The most important concept for comparing algorithms is Big O notation — a way of describing how an algorithm's resource requirements (typically time) grow as the input size grows.
Think of it as answering the question: "If I double the input size, what happens to the processing time?"
| Big O Class | Name | What It Means | Example |
|---|---|---|---|
| O(1) | Constant time | Same time regardless of input size | Looking up a value by key in a hash table |
| O(log n) | Logarithmic | Time grows slowly; doubling input adds one step | Binary search |
| O(n) | Linear | Time grows proportionally with input | Linear search, reading a list |
| O(n log n) | Linearithmic | Slightly worse than linear | Merge sort, quicksort |
| O(n²) | Quadratic | Doubling input quadruples time | Bubble sort, nested loops |
| O(2^n) | Exponential | Doubling input squares the time | Brute-force password cracking |
| O(n!) | Factorial | Grows faster than any polynomial | Brute-force traveling salesman problem |
Why This Matters in Practice
An O(n²) algorithm processing 1,000 items takes 1,000,000 operations. Processing 10,000 items takes 100,000,000 operations — 100 times more work for 10 times more data. At 1,000,000 items, it becomes practically unusable.
An O(n log n) algorithm processing 1,000,000 items takes roughly 20,000,000 operations — manageable. The difference between O(n²) and O(n log n) is the difference between a system that works and one that doesn't, at scale.
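The arithmetic behind these comparisons is simple enough to compute directly. A small sketch that tabulates approximate operation counts for each complexity class:

```python
import math

def operations(n):
    """Approximate operation counts for an input of size n, per Big O class."""
    return {
        "O(log n)": math.ceil(math.log2(n)),
        "O(n)": n,
        "O(n log n)": n * math.ceil(math.log2(n)),
        "O(n^2)": n ** 2,
    }

for n in (1_000, 1_000_000):
    print(n, operations(n))
# at n = 1,000,000: O(n log n) needs ~20 million operations,
# while O(n^2) needs a trillion
```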
This is why choosing the right algorithm matters far more than hardware speed for large-scale problems. You cannot buy your way out of a bad algorithm with faster computers when the input is very large.
A concrete illustration: the Traveling Salesman Problem (find the shortest route visiting all cities exactly once) has no known polynomial-time algorithm. A brute-force approach for 20 cities requires checking 20! = 2.4 quintillion routes — computationally infeasible even on the fastest hardware. The gap between "fast algorithm exists" and "no fast algorithm known" is one of the deepest open questions in computer science: the P vs. NP problem, which asks whether every problem whose solution can be verified quickly can also be solved quickly. It remains unsolved and its resolution would have profound implications for cryptography, optimization, and artificial intelligence.
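The factorial blow-up is easy to feel by brute-forcing a tiny instance. With only five hypothetical cities there are 4! = 24 routes to check (fixing the starting city); at 20 cities the same approach would need to check 19! route orderings of the remaining cities, which is already hopeless:

```python
from itertools import permutations
import math

# coordinates of five made-up cities
cities = {"A": (0, 0), "B": (1, 5), "C": (5, 2), "D": (6, 6), "E": (8, 0)}

def tour_length(order):
    # total distance visiting cities in this order, then returning to the start
    legs = zip(order, order[1:] + order[:1])
    return sum(math.dist(cities[a], cities[b]) for a, b in legs)

names = list(cities)
# fix the first city and try every ordering of the rest: (n - 1)! routes
routes = [(names[0],) + rest for rest in permutations(names[1:])]
best = min(routes, key=tour_length)
print(len(routes), best, round(tour_length(best), 2))
```

Five cities is instant; each additional city multiplies the route count, which is why brute force stops being an option almost immediately.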
Recommendation Algorithms: How Netflix, Spotify, and YouTube Decide What You See
Recommendation systems are among the most economically important algorithms ever built. Netflix estimates that its recommendation algorithm drives approximately 80% of content watched on the platform (Gomez-Uribe & Hunt, 2015). Spotify's Discover Weekly, launched in 2015, became one of the most-used features in music streaming history — generating 5 billion streams in its first year. Understanding how they work requires understanding two approaches:
Collaborative Filtering
Collaborative filtering makes recommendations based on the behavior of users similar to you. The logic: if User A and User B have rated 200 films similarly, and User A loved a film that User B hasn't seen, recommend that film to User B.
At scale, this is done using matrix factorization techniques that represent all users and all items as vectors in a mathematical space, where proximity indicates similarity. Users and items that are "close" in this space are considered compatible.
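The similarity comparison at the heart of this can be sketched with cosine similarity between user rating vectors; the users and ratings below are invented, and real systems compare millions of such vectors via matrix factorization rather than one pair at a time:

```python
import math

# each user's ratings for the same four films (0 = not rated); illustrative data
ratings = {
    "alice": [5, 4, 0, 1],
    "bob":   [4, 5, 1, 1],
    "carol": [1, 1, 5, 4],
}

def cosine_similarity(u, v):
    """1.0 means identical taste direction; near 0 means little overlap."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

sim_alice_bob = cosine_similarity(ratings["alice"], ratings["bob"])
sim_alice_carol = cosine_similarity(ratings["alice"], ratings["carol"])
print(round(sim_alice_bob, 3), round(sim_alice_carol, 3))
```

Bob's vector points in nearly the same direction as Alice's while Carol's does not, so a collaborative filter would recommend Bob's favorites to Alice before Carol's.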
Netflix's early recommendation system was based on a matrix factorization approach that won the famous Netflix Prize — a $1 million competition run from 2006 to 2009, challenging teams to improve Netflix's recommendation algorithm by at least 10%. The winning team, BellKor's Pragmatic Chaos, achieved a 10.06% improvement through a sophisticated ensemble of matrix factorization methods (Koren, Bell & Volinsky, 2009). The competition galvanized academic research on collaborative filtering and the resulting algorithms influenced recommendation system design for years.
Collaborative filtering works well when there is abundant user data. It struggles with the cold start problem: a new user with no history has no similar users to learn from.
Content-Based Filtering
Content-based filtering analyzes the attributes of items you have engaged with and recommends similar items based on those attributes. Spotify's content-based recommendation system analyzes audio features of tracks (tempo, key, energy, danceability, acousticness, speechiness) and recommends tracks with similar characteristics.
Content-based filtering works from the first interaction and does not require learning from other users. Its limitation is that it tends to recommend more of what you already know — it cannot easily introduce genuinely novel tastes.
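The attribute-matching idea can be sketched as a nearest-neighbor lookup in a feature space; the track names and feature vectors below are invented for illustration, loosely modeled on the kinds of audio features the paragraph above describes:

```python
import math

# hypothetical per-track features: (tempo_norm, energy, danceability, acousticness)
tracks = {
    "track_a": (0.80, 0.90, 0.70, 0.10),
    "track_b": (0.75, 0.85, 0.80, 0.15),  # similar profile to track_a
    "track_c": (0.30, 0.20, 0.30, 0.90),  # slow, quiet, acoustic
}

liked = "track_a"
# recommend the tracks closest to the liked one in feature space
recommendations = sorted(
    (t for t in tracks if t != liked),
    key=lambda t: math.dist(tracks[liked], tracks[t]),
)
print(recommendations[0])  # track_b
```

Because the comparison uses only item attributes, it works from a user's very first interaction; no other users' data is needed, which is exactly the cold-start advantage noted above.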
Hybrid Systems
Modern platforms combine both approaches, often with additional signals:
- Context signals: Time of day, device being used, what activity the user is doing
- Recency weighting: Recent behavior may be more predictive than older behavior
- Diversity injection: Deliberately introducing varied recommendations to avoid filter bubbles
- Engagement optimization: Training models not just on what users click but on longer-term satisfaction signals
- Session-based models: Recurrent neural networks or transformer architectures that model the sequence of interactions in a single session, predicting what comes naturally next
The last point raises a critical issue: what you optimize for determines what the algorithm produces. A recommendation algorithm optimized for clicks will surface click-bait. An algorithm optimized for watch time will surface content that is addictive rather than valuable. An algorithm optimized for user satisfaction three days later will behave differently from one optimized for immediate engagement.
Research by Stray et al. (2021) examining engagement-optimized recommendation algorithms found evidence that platforms optimizing for raw engagement metrics create systematic incentives to surface emotionally provocative content — because such content generates more immediate engagement regardless of whether users report wanting more of it. This feedback loop, rather than any intentional design to cause harm, is the mechanism by which recommendation algorithms can amplify extreme or misleading content.
How Search Algorithms Work
Modern search engines are not a single algorithm but a pipeline of many:
Crawling: Web crawlers (automated bots) follow hyperlinks across the web and download page content for indexing. Google's crawler is called Googlebot; it processes billions of pages continuously to keep the index current.
Indexing: Downloaded pages are processed, parsed, and stored in a massive inverted index — a data structure that maps every word to the pages that contain it. Google's index reportedly covers hundreds of billions of pages, making it one of the largest data structures ever built.
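A miniature inverted index makes the structure concrete; the three "documents" below stand in for billions of web pages:

```python
from collections import defaultdict

docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick thinking saves the day",
}

# inverted index: word -> set of document ids containing that word
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

def search(query):
    """Return documents containing every query term (simple AND semantics)."""
    term_sets = [index[word] for word in query.split()]
    return set.intersection(*term_sets) if term_sets else set()

print(search("quick"))        # {1, 3}
print(search("brown quick"))  # {1}
```

Answering a query becomes a set intersection over precomputed word lists rather than a scan of every page, which is what makes web-scale search feasible at all.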
Ranking: When a user submits a query, the search engine retrieves all indexed pages containing the query terms and ranks them using hundreds of signals. Google's ranking uses over 200 factors, including:
- PageRank (link authority)
- Content relevance and quality signals
- User behavior signals (click-through rate, dwell time)
- Page speed and mobile-friendliness
- Freshness of content for time-sensitive queries
- Semantic understanding of query intent
- E-E-A-T signals (Experience, Expertise, Authoritativeness, Trustworthiness) — particularly weighted for health, financial, and legal content
BERT and neural ranking: Since 2019, Google has used BERT (Bidirectional Encoder Representations from Transformers) — a deep learning model — to understand the semantic meaning of queries and passages, dramatically improving performance on natural language questions. Google stated at BERT's launch that it affected roughly 1 in 10 English language searches (Nayak, 2019). Subsequent neural ranking systems have gone further, with Google's MUM (Multitask Unified Model) and Gemini-based systems extending multimodal understanding to images, video, and cross-language queries.
The Algorithmic Accountability Problem
As algorithms make increasingly consequential decisions, a new field of research has emerged examining how to make algorithmic decision-making fair, interpretable, and accountable.
Algorithmic fairness research has produced multiple competing mathematical definitions of fairness — which, disturbingly, are often mathematically incompatible. Equalized false positive rates (same error rate across demographic groups), equal positive predictive value (predictions equally accurate across groups), and demographic parity (equal selection rates across groups) cannot all be simultaneously satisfied in most real-world scenarios (Chouldechova, 2017). This is not a technical limitation that better engineering can resolve — it is a fundamental mathematical result that means "fair" algorithms require explicit value choices about which form of fairness to prioritize.
Algorithmic explainability addresses the problem that many high-performing machine learning algorithms (particularly deep neural networks) are "black boxes" — even their designers cannot fully explain why they produce a specific output for a specific input. The EU's General Data Protection Regulation (GDPR) includes a right to explanation for automated decisions, which has created legal pressure for interpretable models in consequential decisions (such as credit scoring or parole recommendations).
When Algorithms Get It Wrong: Failure Modes
Understanding how algorithms work also means understanding how they fail.
Feedback loops: A recommendation algorithm that shows users content they engage with trains itself on engagement data — which includes all the things people regret clicking. Engagement signals loop back into training data, potentially amplifying polarizing or emotionally provocative content even if users do not want more of it.
Training data bias: Machine learning algorithms learn from historical data. If that data reflects historical discrimination — as hiring decisions, loan approvals, or policing patterns often do — the algorithm will learn to replicate that discrimination. A landmark 2018 study by Joy Buolamwini and Timnit Gebru (Buolamwini & Gebru, 2018) found that commercial facial recognition systems had substantially higher error rates for darker-skinned women than lighter-skinned men — because training data underrepresented darker-skinned faces, and the systems had learned from a biased sample.
Distribution shift: An algorithm trained on data from one context may fail when deployed in another. A fraud detection model trained on pre-pandemic spending patterns performed poorly in 2020 when spending patterns changed abruptly. A content moderation model trained on English-language toxic speech may perform poorly on non-English languages where labeled training data is scarce.
Goodhart's Law: When a measure becomes a target, it ceases to be a good measure. An algorithm optimizing for a proxy metric (clicks, engagement, five-star ratings) will find ways to maximize that metric that may not serve the underlying goal. This is why Amazon famously discovered that its internal hiring algorithm had learned to penalize resumes containing the word "women's" — because it was trained on historical promotion data in a male-dominated organization, and historical patterns became encoded as targets (Reuters, 2018).
Adversarial attacks: Machine learning systems can be deliberately manipulated by crafting inputs specifically designed to fool them. Researchers have demonstrated that adding small, imperceptible perturbations to images can cause image recognition systems to misclassify objects with high confidence. This is not a theoretical concern: malicious actors manipulating recommendation systems, spam filters, and ad-targeting algorithms is a constant arms race.
Key Takeaways
Algorithms are not magic, not neutral, and not inherently mysterious. They are structured problem-solving procedures, each with characteristics that determine what problems they are suited for and how they scale.
The core concepts that matter for non-programmers:
- An algorithm is a step-by-step process that transforms input into output — nothing more, nothing less.
- Big O notation describes scaling: The difference between O(log n) and O(n²) is the difference between a system that handles a million items easily and one that cannot.
- Recommendation systems combine collaborative and content-based filtering, optimized for signals that may or may not align with what users actually want.
- Algorithms encode choices: What to optimize for, what data to use, and what tradeoffs to make are human decisions embedded in algorithmic systems.
- Algorithmic failures are predictable: Feedback loops, training bias, distribution shift, and proxy metric optimization are known failure modes, not surprises.
- Fairness is mathematically complex: Multiple legitimate definitions of fairness are mathematically incompatible — making "fair algorithms" a value choice, not a purely technical problem.
The algorithms shaping what you see, read, buy, and are approved for are built by people making specific choices. Understanding how they work is the first step toward evaluating whether those choices serve you.
References
- Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Proceedings of the 7th International Conference on World Wide Web, 107-117. https://doi.org/10.1016/S0169-7552(98)00110-X
- Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research, 81, 1-15.
- Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), 153-163. https://doi.org/10.1089/big.2016.0047
- Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 269-271. https://doi.org/10.1007/BF01386390
- Fuller, J., et al. (2021). Hidden Workers: Untapped Talent. Harvard Business School and Accenture.
- Gomez-Uribe, C. A., & Hunt, N. (2015). The Netflix recommender system: Algorithms, business value, and innovation. ACM Transactions on Management Information Systems, 6(4), 1-19. https://doi.org/10.1145/2843948
- Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37. https://doi.org/10.1109/MC.2009.263
- Meta. (2023). How Facebook's News Feed Algorithm Works. Meta Transparency Center. https://transparency.fb.com
- Nayak, P. (2019). Understanding Searches Better than Ever Before. Google Blog. https://blog.google/products/search/search-language-understanding-bert/
- Reuters. (2018, October 10). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/
- Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120-126. https://doi.org/10.1145/359340.359342
- Stray, J., et al. (2021). What are you optimizing for? Aligning recommender systems with human values. Proceedings of the ICML 2020 Workshop on Challenges in Deploying and Monitoring Machine Learning Systems.
- Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, s2-42(1), 230-265. https://doi.org/10.1112/plms/s2-42.1.230
Frequently Asked Questions
What is an algorithm in simple terms?
An algorithm is a finite, step-by-step set of instructions for solving a problem or accomplishing a task. Every algorithm takes some input, processes it through a defined sequence of steps, and produces an output. Algorithms are not unique to computers — a recipe, a set of driving directions, or a checklist are all algorithms. What makes computer algorithms distinctive is that they must be precise enough for a machine to execute without ambiguity.
What is Big O notation and why does it matter?
Big O notation is a way of describing how the time or memory requirements of an algorithm scale as the input size grows. O(1) means constant time — the operation takes the same amount of time regardless of input size. O(n) means linear time — the operation scales proportionally with the input. O(n²) means quadratic time — doubling the input quadruples the time. Big O matters because an algorithm that works fine with 100 items may be unusably slow with 1 million items if it has poor scaling characteristics.
How do recommendation algorithms work?
Most recommendation algorithms use one of two approaches, often in combination. Collaborative filtering finds users with similar behavior to you and recommends what those users liked. Content-based filtering analyzes the attributes of items you have engaged with and finds similar items. Modern systems like those used by Netflix, Spotify, and YouTube combine these approaches with deep learning models that identify complex patterns in behavior data at a scale that no explicit rule system could replicate.
What is the difference between a sorting algorithm and a search algorithm?
A sorting algorithm rearranges a collection of items into a defined order (such as alphabetical or numerical). Common examples include quicksort, mergesort, and bubble sort. A search algorithm finds a specific item within a collection. The simplest search algorithm (linear search) checks items one by one. Binary search is more efficient — it repeatedly halves the search space by comparing the target to the middle value — but requires the data to already be sorted.
Are algorithms neutral?
No. Algorithms reflect the choices made by the people who design them — which outcomes to optimize for, which data to train on, and which tradeoffs to make. An algorithm that optimizes for user engagement may amplify emotionally provocative content. A hiring algorithm trained on historical promotion data may encode historical biases. Describing an algorithm as neutral because it is automated is a common misconception that has significant consequences when algorithms are used in consequential decisions like hiring, lending, or criminal justice.