Why Technology Vocabulary Matters

Someone says: "We'll leverage AI and blockchain to disrupt the market." (Buzzwords without substance)

An article claims: "Your data is stored securely in the cloud." (What does that mean? Where is it actually?)

A tech company promises: "Our algorithm is unbiased and fair." (Algorithms aren't neutral—they reflect their design)

Imprecise technology language creates confusion, enables deception, and prevents informed decision-making.

Technology terms have become ubiquitous in daily life—yet many remain mysterious or misunderstood. API, encryption, open source, machine learning—these aren't just jargon for programmers. They describe the infrastructure of modern life.

Understanding actual technology concepts matters because:

  • Technology shapes work, communication, privacy, power
  • Marketing uses technical terms to confuse or impress
  • Policy debates require technical literacy
  • You can't spot manipulation or nonsense without basic vocabulary

This is the vocabulary that demystifies how digital systems actually work—and why that matters for everyone.

"Any sufficiently advanced technology is indistinguishable from magic." — Arthur C. Clarke

Fundamental Computing Concepts

Algorithm

Definition: Step-by-step set of instructions for solving a problem or completing a task.

Metaphor: Recipe (specific steps to produce result)

Characteristics:

  • Deterministic: Same input → Same output (randomized algorithms are a deliberate exception)
  • Finite: Terminates in finite steps
  • Unambiguous: Each step clearly defined

Examples:

  • Google search: Algorithm ranks websites based on relevance
  • GPS navigation: Algorithm calculates shortest route
  • Credit scoring: Algorithm predicts loan default risk
  • Social media feed: Algorithm decides what you see

Simple algorithm (find largest number in list):

1. Set "largest" to first number
2. For each remaining number:
   - If number > largest, set "largest" to this number
3. Return "largest"
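
These steps translate directly into Python. This is a minimal sketch; Python's built-in max() does the same job:

```python
def find_largest(numbers):
    largest = numbers[0]           # Step 1: start with the first number
    for number in numbers[1:]:     # Step 2: scan the remaining numbers
        if number > largest:       #   If one is bigger, remember it
            largest = number
    return largest                 # Step 3: return the result

print(find_largest([42, 7, 19, 88, 3]))  # 88
```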

Why "algorithm" matters:

  • Algorithms aren't neutral (they reflect designer's choices)
  • Algorithms can encode bias (training data, optimization goals)
  • Algorithms are hidden (you don't see how decisions are made)
  • Algorithms have power (determine what you see, who gets loans, who gets arrested)

Common misconception: "Algorithm" sounds scientific/objective. Reality: Algorithms are human-designed tools with embedded values and trade-offs.

Application: When someone says "the algorithm decided," ask: "Who designed the algorithm? What was it optimized for? What does it ignore?"

"A computer would deserve to be called intelligent if it could deceive a human into believing that it was human." — Alan Turing

Data vs. Information vs. Knowledge

Data:

  • Definition: Raw, unprocessed facts
  • Example: "42, 37, 41, 38, 40"

Information:

  • Definition: Processed data given context and meaning
  • Example: "Daily temperatures (°F) this week"

Knowledge:

  • Definition: Information understood and integrated into mental models; actionable insights
  • Example: "It's been consistently cool this week; bring a jacket"

Progression: Data → Information (add context) → Knowledge (add understanding)

Stage       | Example                                          | Characteristics
Data        | 1.2 GB, 500 kbps, 12 ms                          | Numbers without context
Information | Internet speed: 500 kbps download, 12 ms latency | Organized, contextualized data
Knowledge   | My internet is slow; video calls will lag        | Understanding, implications, action
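
The same progression can be sketched in a few lines of Python (the temperatures, threshold, and labels here are illustrative, not from any real dataset):

```python
# Data: raw numbers, no context
data = [42, 37, 41, 38, 40]

# Information: the same numbers with context attached
information = {"daily_highs_f": data}

# Knowledge: interpret the information and turn it into action
avg_high = sum(information["daily_highs_f"]) / len(data)
knowledge = "bring a jacket" if avg_high < 50 else "no jacket needed"
print(knowledge)  # bring a jacket
```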

Why distinction matters: "Data-driven" sounds impressive, but data without interpretation is meaningless. Need context (information) and understanding (knowledge).

Application: Don't confuse collecting data with gaining insight. Data is raw material; knowledge is the product.

Binary and Bits

Binary:

  • Definition: Number system using only 0 and 1
  • Why computers use it: Digital circuits have two states (on/off, high voltage/low voltage)

Bit (binary digit):

  • Definition: Smallest unit of data; single 0 or 1
  • Storage: One bit can represent 2 values (0 or 1)

Byte:

  • Definition: 8 bits
  • Storage: One byte can represent 256 values (2^8)
  • Uses: One character (letter) typically stored as one byte
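
These claims can be checked directly in Python; str.encode is the standard library's text-to-bytes conversion:

```python
# One byte = 8 bits, so it can hold 2**8 = 256 distinct values
assert 2 ** 8 == 256

# A basic Latin character occupies one byte in UTF-8...
assert len("A".encode("utf-8")) == 1

# ...but characters outside ASCII can take two or more bytes
assert len("é".encode("utf-8")) == 2
```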

Larger units:

  • Kilobyte (KB): ~1,000 bytes (10^3)
  • Megabyte (MB): ~1 million bytes (10^6)
  • Gigabyte (GB): ~1 billion bytes (10^9)
  • Terabyte (TB): ~1 trillion bytes (10^12)

Example scales:

  • Text message: ~1 KB
  • Photo: ~3-5 MB
  • Movie: ~1-5 GB
  • Hard drive: 500 GB - 2 TB

Why it matters: Understanding scale helps evaluate storage needs, bandwidth, data privacy (how much data companies collect).

Application: When company says "we collect minimal data," ask how many bytes. Context matters.

Internet and Networking

Internet vs. Web

Internet:

  • Definition: Global network of interconnected computers using standardized protocols (TCP/IP)
  • Analogy: Highway system (infrastructure)
  • Includes: Email, file transfer, video calls, messaging, and the Web itself

World Wide Web (Web):

  • Definition: System of interlinked documents (web pages) accessed via internet using HTTP/HTTPS
  • Analogy: One specific type of traffic on highway (websites, browsers)
  • Inventor: Tim Berners-Lee (1989)

Relationship: Web runs on internet. Internet existed before web.

Aspect   | Internet                    | Web
What     | Network infrastructure      | Document/information system
Protocol | TCP/IP                      | HTTP/HTTPS
Invented | 1960s-1980s (ARPANET)       | 1989 (Berners-Lee)
Includes | Email, FTP, VoIP, Web, etc. | Websites, browsers

Application: "Internet access" is broader than "web browsing." Many internet services aren't web-based (email apps, messaging, games).

IP Address and DNS

IP Address (Internet Protocol Address):

  • Definition: Unique numerical address assigned to each device on a network
  • Format: IPv4 (192.168.1.1) or IPv6 (2001:0db8:85a3:0000:0000:8a2e:0370:7334)
  • Function: Identifies and locates devices (like postal address for computers)

DNS (Domain Name System):

  • Definition: System that translates human-readable domain names into IP addresses
  • Metaphor: Phone book (name → number)
  • Example: whennotesfly.com → 192.0.2.1

Why needed: Humans remember names (google.com); computers use numbers (142.250.185.46). DNS bridges the gap.

Process:

  1. You type: "google.com"
  2. DNS lookup: "What's the IP for google.com?" → "142.250.185.46"
  3. Browser connects to that IP address
  4. Page loads
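
Conceptually, DNS maps names to addresses like a dictionary. The toy resolver below illustrates the idea (the table and addresses are made up; the real system is a distributed hierarchy of servers, and Python's actual lookup call is socket.gethostbyname):

```python
# Toy phone book: domain name -> IP address (illustrative entries only)
dns_table = {
    "example.com": "93.184.216.34",
    "example.org": "192.0.2.1",
}

def resolve(domain):
    """Return the IP address for a domain, mimicking a DNS query."""
    ip = dns_table.get(domain)
    if ip is None:
        # Real DNS returns an NXDOMAIN error for unknown names
        raise LookupError(f"NXDOMAIN: {domain}")
    return ip

print(resolve("example.com"))  # 93.184.216.34
```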

Application: DNS is critical infrastructure. Controlling DNS can censor or redirect traffic (why DNS security matters).

Bandwidth and Latency

Bandwidth:

  • Definition: Maximum data transfer rate (how much data can flow)
  • Metaphor: Width of pipe (more water flows through wider pipe)
  • Units: Bits per second (bps), kilobits per second (kbps), megabits per second (Mbps)
  • Affects: Download/upload speed, video quality

Latency:

  • Definition: Time delay for data to travel from source to destination
  • Metaphor: Length of pipe (longer pipe = more travel time)
  • Units: Milliseconds (ms)
  • Affects: Responsiveness, real-time interaction (video calls, gaming)

Both matter:

  • High bandwidth, high latency: Downloads fast but delayed response (satellite internet)
  • Low bandwidth, low latency: Slow downloads but responsive interaction (e.g., an uncongested wired connection)
  • High bandwidth, low latency: Fast and responsive (ideal—fiber optic)

Activity        | Bandwidth Need | Latency Sensitivity
Email           | Very low       | Very low
Web browsing    | Low-medium     | Medium
Video streaming | High           | Medium
Video calls     | Medium         | High (delay obvious)
Online gaming   | Medium         | Very high (lag = death)
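
How the two combine can be seen with a rough back-of-envelope model (it ignores protocol overhead, congestion, and retransmission; the numbers are illustrative):

```python
def transfer_time_seconds(size_mb, bandwidth_mbps, latency_ms, round_trips=1):
    """Approximate transfer time: propagation delay plus serialization time."""
    size_megabits = size_mb * 8                      # bytes are 8 bits each
    delay = (latency_ms / 1000) * round_trips        # latency contribution
    return delay + size_megabits / bandwidth_mbps    # bandwidth contribution

# A 5 MB photo over a 100 Mbps link with 20 ms latency:
print(round(transfer_time_seconds(5, 100, 20), 2))  # 0.42 (seconds)
```

For a large download, bandwidth dominates; for a tiny request repeated many times (gaming, video calls), latency dominates.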

Application: "Fast internet" can mean high bandwidth or low latency. Different activities need different things.

Software and Development

API (Application Programming Interface)

Definition: Set of rules and protocols allowing different software programs to communicate and share data.

Metaphor: Restaurant menu

  • You (customer) don't need to know how kitchen works
  • Menu (API) shows what you can order
  • Kitchen (backend system) prepares order
  • Waiter (API) delivers result

Example:

  • Weather app on your phone doesn't measure weather
  • It calls weather service's API: "What's the weather in New York?"
  • API returns data: "72°F, sunny"
  • App displays it
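
The contract idea can be shown with a toy in-process API (the function name, city, and response fields are invented for illustration, not a real weather service):

```python
# The provider's public interface (the "menu"): callers see only the
# function signature, never the implementation (the "kitchen").
def weather_api(city):
    _internal_db = {"New York": {"temp_f": 72, "conditions": "sunny"}}
    report = _internal_db.get(city)
    if report is None:
        return {"error": "unknown city"}
    return report

# The app (the "customer") relies only on the documented contract:
report = weather_api("New York")
print(f"{report['temp_f']}°F, {report['conditions']}")  # 72°F, sunny
```

Real web APIs work the same way, except the call travels over HTTP and the response arrives as JSON.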

Why APIs matter:

  • Enable integration (different services work together)
  • Hide complexity (use functionality without understanding implementation)
  • Create ecosystems (developers build on platforms via APIs)
  • Drive network effects (more users → more integrations → more value)

"The value of a network grows as the square of the number of its users." — Metcalfe's Law (Bob Metcalfe)

Types:

  • Web API: Access over internet (HTTP/HTTPS)
  • Library API: Functions within programming language
  • Operating System API: Access system resources (files, network)

Real-world APIs:

  • Google Maps API (embed maps in your app)
  • Payment APIs (Stripe, PayPal—process payments)
  • Social media APIs (post to Twitter from another app)

Application: When a service says "we don't share data," check its API documentation. APIs can expose data you didn't realize was available.

Open Source vs. Proprietary Software

Open Source:

  • Definition: Software whose source code is publicly available for anyone to view, modify, and distribute
  • License: Varies (MIT, GPL, Apache—different permissions)
  • Development: Often collaborative (community contributors)
  • Cost: Usually free (but paid support available)

Examples: Linux, Firefox, WordPress, LibreOffice, Python

Advantages:

  • Transparency (can audit code for security/privacy)
  • Customization (modify to fit needs)
  • Community support and innovation
  • No vendor lock-in

Disadvantages:

  • Support may be limited or volunteer-based
  • User interface sometimes less polished
  • Compatibility issues possible

Proprietary (Closed Source):

  • Definition: Software whose source code is secret; controlled by company
  • License: Restrictive (can't modify, often can't share)
  • Development: Internal (company employees)
  • Cost: Often paid (or monetized through ads/data)

Examples: Windows, Microsoft Office, Adobe Photoshop, most mobile apps

Advantages:

  • Professional support
  • Often polished interface
  • Guaranteed compatibility (within ecosystem)

Disadvantages:

  • No transparency (can't audit code)
  • Vendor lock-in (dependent on company)
  • Cost (subscription or licensing fees)
  • Limited customization

Hybrid models: "Open core" (basic version open source, premium features proprietary)

Application: Open source doesn't automatically mean "better" or "more secure"—depends on project. But transparency enables verification.

Cloud Computing

Definition: Storing and accessing data and programs over the internet instead of on your local computer's hard drive.

"The cloud": Not magical. Means "someone else's computer" (usually massive data centers).

Types:

1. Software as a Service (SaaS):

  • What: Use software over internet (don't install locally)
  • Examples: Gmail, Google Docs, Salesforce, Dropbox
  • User: General users

2. Platform as a Service (PaaS):

  • What: Development and deployment platform over internet
  • Examples: Heroku, Google App Engine
  • User: Developers (build apps without managing servers)

3. Infrastructure as a Service (IaaS):

  • What: Virtual servers, storage, networking over internet
  • Examples: Amazon Web Services (AWS), Microsoft Azure, Google Cloud
  • User: IT departments (configure own infrastructure)

Advantages:

  • Access from anywhere
  • Scalability (add resources as needed)
  • No hardware maintenance
  • Often cheaper (no upfront infrastructure costs)

Disadvantages:

  • Requires internet connection
  • Privacy/security concerns (data on someone else's servers)
  • Vendor lock-in
  • Ongoing costs (subscription)

Privacy implication: "Cloud storage" means company has access to your data (legally, technically, or both). Encryption matters.

Application: "Cloud" isn't inherently good or bad. Ask: Where is data stored? Who can access it? What happens if service shuts down?

Security and Privacy

Encryption

Definition: Converting information into code (ciphertext) to prevent unauthorized access; only those with key can decode (decrypt) it.

Metaphor: Locking document in safe. Only those with combination can open.

Types:

1. Symmetric Encryption:

  • Method: Same key encrypts and decrypts
  • Example: AES (Advanced Encryption Standard)
  • Use case: Fast, efficient; used for data at rest
  • Challenge: How to securely share key?

2. Asymmetric Encryption (Public Key Cryptography):

  • Method: Two keys—public key (encrypts), private key (decrypts)
  • Example: RSA, ECC
  • Use case: Secure communication over internet (HTTPS)
  • Advantage: Can share public key openly; only private key holder can decrypt

Example - Sending encrypted email:

  1. You encrypt message with recipient's public key
  2. Only recipient (holding private key) can decrypt
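
The public/private key idea can be demonstrated with textbook RSA on tiny numbers. This is the classic small-number example, useful only for intuition; real RSA keys are hundreds of digits long, and this toy version is completely insecure:

```python
# Key generation from two (tiny) primes
p, q = 61, 53
n = p * q                  # 3233: the modulus, part of both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent (coprime with phi)
d = pow(e, -1, phi)        # private exponent: modular inverse of e

def encrypt(m, public=(e, n)):
    exp, mod = public
    return pow(m, exp, mod)      # anyone with the public key can encrypt

def decrypt(c, private=(d, n)):
    exp, mod = private
    return pow(c, exp, mod)      # only the private key holder can decrypt

message = 65                      # messages are numbers smaller than n
ciphertext = encrypt(message)
assert decrypt(ciphertext) == message
```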

End-to-End Encryption (E2EE):

  • Definition: Only sender and recipient can read messages; service provider cannot
  • Examples: Signal, WhatsApp (when enabled), iMessage
  • Contrast: Standard email (Gmail, Yahoo) is not E2EE—provider can read messages

Why encryption matters:

  • Protects privacy (eavesdroppers can't read)
  • Secures data (hackers accessing database get gibberish without keys)
  • Enables secure transactions (online banking, shopping)

Application: Check for HTTPS (encrypted web connection) before entering passwords or payment info. Use E2EE for private conversations.

Cookies and Tracking

Cookie:

  • Definition: Small text file stored on your device by websites
  • Purpose: Remember information (login status, preferences, shopping cart)
  • Types:
    • Session cookies: Temporary (deleted when browser closes)
    • Persistent cookies: Remain for set duration
    • First-party cookies: Set by website you're visiting
    • Third-party cookies: Set by external services (ads, analytics)

Why cookies matter:

  • Enable functionality (stay logged in, remember preferences)
  • Enable tracking (build profile of browsing history)

Tracking technologies:

  • Cookies: Traditional method
  • Browser fingerprinting: Identify you by unique browser configuration
  • Pixel tracking: Invisible images loading from external server (email open tracking)
  • Device IDs: Mobile advertising identifiers

Privacy concerns:

  • Third-party cookies track across websites (build comprehensive profile)
  • Data sold to data brokers
  • Targeted ads based on behavior
  • Potential discrimination (insurance, employment based on profiles)

Controls:

  • Browser settings (block third-party cookies)
  • Private/incognito mode (doesn't save cookies long-term)
  • VPN (hides IP address)
  • Tracker blockers (browser extensions)

Application: "Accept all cookies" means "let us track everything." Read privacy policies or use blocking tools.

"If you are not paying for it, you're not the customer; you're the product." — Andrew Lewis

Two-Factor Authentication (2FA)

Definition: Security method requiring two different forms of verification to log in.

Factors:

  1. Something you know: Password, PIN
  2. Something you have: Phone, security key, authentication app
  3. Something you are: Fingerprint, face recognition

Common 2FA methods:

Method                                          | Factor                 | Security Level             | Convenience
SMS code                                        | Phone (have)           | Medium (SIM swapping risk) | High
Authenticator app (Google Authenticator, Authy) | Phone (have)           | High                       | High
Security key (YubiKey)                          | Physical device (have) | Very high                  | Medium
Biometric                                       | Fingerprint/face (are) | Varies                     | Very high

Why 2FA matters:

  • Even if password stolen, attacker needs second factor
  • Dramatically reduces account compromise risk

Best practice: Enable 2FA on critical accounts (email, banking, social media).

Application: Password alone is insufficient. Use 2FA wherever possible, preferably authenticator app or hardware key (more secure than SMS).
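
The codes an authenticator app displays come from TOTP (RFC 6238): a shared secret plus the current time, hashed with HMAC. A standard-library sketch of the algorithm (simplified; real implementations also handle clock drift and secret provisioning):

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, interval=30, digits=6, timestamp=None):
    """Time-based one-time password (RFC 6238, SHA-1 variant)."""
    key = base64.b32decode(secret_b32)
    now = timestamp if timestamp is not None else time.time()
    counter = int(now // interval)                     # 30-second time step
    msg = struct.pack(">Q", counter)                   # 8-byte big-endian
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Both your phone and the server compute the same code from the shared secret, so the code proves possession of the device without the secret ever crossing the network at login time.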

Emerging Technology Concepts

Artificial Intelligence (AI) and Machine Learning (ML)

Artificial Intelligence (AI):

  • Definition: Broad field—systems performing tasks typically requiring human intelligence
  • Includes: Learning, reasoning, problem-solving, perception, language understanding
  • Example: Chess-playing computer, voice assistant, self-driving car

Machine Learning (ML):

  • Definition: Subset of AI—computers learn patterns from data without explicit programming for each case
  • Process: Feed data → Algorithm finds patterns → Makes predictions on new data
  • Example: Email spam filter (learns what spam looks like from examples)

Relationship: ML is technique within broader AI field.

Types of ML:

1. Supervised Learning:

  • Method: Learn from labeled examples (input + correct output)
  • Example: Show 1,000 cat photos labeled "cat" → Algorithm learns to recognize cats
  • Use: Classification, prediction

2. Unsupervised Learning:

  • Method: Find patterns in unlabeled data
  • Example: Group customers by purchasing behavior (no predefined categories)
  • Use: Clustering, pattern discovery

3. Reinforcement Learning:

  • Method: Learn through trial and error (rewards for good actions, penalties for bad)
  • Example: Game-playing AI (AlphaGo)
  • Use: Strategy, optimization
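
Supervised learning in miniature: the toy "spam filter" below learns a single threshold from labeled examples (the messages, the exclamation-mark feature, and the threshold rule are all invented for illustration; real filters use thousands of features and far better models):

```python
# Labeled training data: (message, label) where 1 = spam, 0 = not spam
training = [
    ("win money now!!!", 1),
    ("meeting at 3pm", 0),
    ("free prize!!", 1),
    ("lunch tomorrow?", 0),
]

def features(text):
    """A single crude feature: how many exclamation marks."""
    return text.count("!")

# "Training": pick a threshold halfway between the classes
spam_counts = [features(t) for t, label in training if label == 1]
ham_counts = [features(t) for t, label in training if label == 0]
threshold = (min(spam_counts) + max(ham_counts)) / 2

def predict(text):
    """Classify new, unseen messages using the learned threshold."""
    return 1 if features(text) > threshold else 0

print(predict("claim your reward!!"))     # 1 (spam-like)
print(predict("see you at the office"))   # 0
```

Note what was never written: a rule saying "two exclamation marks means spam." The rule emerged from the training data, which is exactly why biased data produces biased rules.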

Why AI/ML matters:

  • Increasingly makes decisions affecting you (hiring, loans, bail, medical diagnosis)
  • Can encode bias (biased training data → biased decisions)
  • Often opaque ("black box"—even creators don't fully understand how decisions are made)

Myth vs. Reality:

  • Myth: AI is intelligent like humans
  • Reality: AI is pattern recognition—very good at narrow tasks, no general understanding

Application: When AI makes decision about you, ask: What data was it trained on? What is it optimizing? Can decision be appealed?

Blockchain

Definition: Distributed ledger (record-keeping system) shared across network; data stored in "blocks" linked in "chain."

Key characteristics:

  • Decentralized: No single authority controls it
  • Immutable: Once added, data is extremely difficult to alter
  • Transparent: All participants can see ledger (usually)
  • Cryptographically secured: Blocks linked using cryptographic hashes

How it works (simplified):

  1. Transaction requested
  2. Broadcast to network
  3. Network validates transaction
  4. Transaction added to new block
  5. Block added to chain
  6. Transaction complete
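
The "chain" part can be sketched in a few lines: each block stores the hash of the previous block, so changing any earlier block invalidates every block after it. (This toy omits the network, validation, and consensus steps above; the transaction strings are illustrative.)

```python
import hashlib
import json

def block_hash(block):
    """Deterministic SHA-256 hash of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    """Append a block linked to the previous block's hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64  # genesis marker
    chain.append({"data": data, "prev_hash": prev})

chain = []
add_block(chain, "Alice pays Bob 5")
add_block(chain, "Bob pays Carol 2")

# The link holds: block 1 records block 0's hash
assert chain[1]["prev_hash"] == block_hash(chain[0])

# Tampering with block 0 breaks the link, making the edit detectable
chain[0]["data"] = "Alice pays Bob 500"
assert chain[1]["prev_hash"] != block_hash(chain[0])
```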

Use cases:

  • Cryptocurrency: Bitcoin, Ethereum (digital money without central bank)
  • Supply chain: Track product origin, authenticity
  • Smart contracts: Self-executing agreements (conditions met → Action automatic)
  • Voting: Tamper-resistant records

Advantages:

  • No central point of failure
  • Transparent and auditable
  • Resistant to tampering

Disadvantages:

  • Slow (compared to centralized databases)
  • Energy-intensive (especially Proof of Work systems like Bitcoin)
  • Hard to fix errors (immutability is feature and bug)
  • Scalability challenges

Hype vs. Reality: Blockchain is useful for specific problems (distrust in central authority, need for transparency). Not useful for everything—often slower and more expensive than traditional databases.

Application: Don't assume "blockchain" makes something legitimate or revolutionary. Ask: Why is decentralization needed here? What problem does this solve?

Internet of Things (IoT)

Definition: Network of physical devices ("things") embedded with sensors, software, connectivity—can collect and exchange data.

Examples:

  • Smart home devices (thermostats, lights, locks)
  • Wearables (fitness trackers, smartwatches)
  • Connected appliances (refrigerators, washing machines)
  • Industrial sensors (manufacturing, agriculture)
  • Smart city infrastructure (traffic lights, parking meters)

How it works:

  • Devices collect data (sensors)
  • Send data to cloud or local network
  • Data analyzed (sometimes AI)
  • Action taken (automation, alerts, insights)

Benefits:

  • Convenience (automation of repetitive tasks)
  • Efficiency (optimize energy, reduce waste)
  • Insights (data-driven decisions)

Risks:

  • Security: Many IoT devices poorly secured (vulnerable to hacking)
  • Privacy: Constant data collection (who has access?)
  • Reliability: Depend on internet connection
  • Obsolescence: Manufacturer stops supporting → Device becomes useless or insecure

Famous incident: Mirai botnet (2016)—hackers infected IoT devices (cameras, DVRs), used them to launch massive cyberattack.

Application: Before buying IoT device, ask: Is it necessary? How is data used? Can it be secured? What happens if company shuts down?

How Algorithmic Systems Produce Measurable Real-World Outcomes

The gap between how algorithms are described in public discourse and what research reveals about their actual behavior is substantial. Understanding specific documented cases clarifies what algorithmic bias means in practice and how algorithmic systems can produce consequential effects that their designers did not anticipate.

Latanya Sweeney at Harvard's Data Privacy Lab published a 2013 study in ACM Queue that documented racial bias in Google's online advertising system. Sweeney found that searches for names typically associated with Black individuals (such as "DeShawn") generated ads for arrest record lookup services at significantly higher rates than searches for names typically associated with white individuals. The algorithm had not been designed to produce this outcome -- Google's ad system optimizes clicks, and users were apparently more likely to click arrest record ads when shown a Black-sounding name. The training signal (user behavior) encoded and amplified a social pattern. No human programmer had written a rule to produce racially disparate ad targeting; the disparity emerged from the feedback between algorithmic optimization and pre-existing social behavior. This finding contributed to the field of algorithmic auditing -- the systematic testing of deployed algorithms for disparate impact.

Julia Angwin and colleagues at ProPublica conducted a landmark investigative analysis of the COMPAS risk assessment algorithm in 2016, published under the title "Machine Bias." COMPAS was used by courts in Florida and other states to assess the likelihood that a defendant would reoffend, and scores influenced bail, sentencing, and parole decisions. ProPublica analyzed over 7,000 cases in Broward County, Florida, and found that COMPAS was approximately equally accurate overall (about 65%) for Black and white defendants -- but produced systematically different types of errors. Black defendants who did not go on to reoffend were twice as likely to be incorrectly flagged as high risk as white defendants in the same category. White defendants who did go on to reoffend were more likely to be incorrectly flagged as low risk. Northpointe (the algorithm's developer) contested the framing, arguing that the algorithm satisfied different definitions of fairness that ProPublica had not examined. The controversy sparked a technical literature on "fairness in machine learning" that showed, through formal proofs, that several common fairness criteria are mathematically incompatible -- meaning that any algorithm optimized for one fairness criterion will necessarily fail another, and that choosing which fairness criterion to optimize for is a value judgment that cannot be made technical.

Virginia Eubanks at the University at Albany documented the real-world impact of automated decision systems on disadvantaged populations in her 2018 book Automating Inequality. She examined three case studies: an algorithm used to allocate home healthcare resources in Arkansas that reduced services for people with cerebral palsy and other conditions without providing explanations; a predictive analytics system used to flag families for child welfare investigation in Allegheny County, Pennsylvania; and a coordinated entry system for homeless services in Los Angeles that used algorithm-generated scores to prioritize shelter placement. In each case, Eubanks found that the algorithm operationalized contested value choices as if they were technical decisions, that error correction was difficult because the systems were opaque, and that the burden of algorithmic error fell disproportionately on people who were already marginalized. Her research contributed to growing calls for algorithmic accountability and explainability requirements in public-sector automated decision systems.

Encryption in Practice: From Theory to Real-World Security Outcomes

The practical importance of encryption concepts is best understood through cases where encryption protected sensitive data or -- critically -- where its absence led to documented harm. Several well-documented incidents illustrate what encryption and its absence mean in concrete terms.

The 2013 Target data breach, which exposed approximately 40 million credit card numbers and 70 million customer records, became a standard case study in encryption implementation failures. The attackers gained access to Target's network through a third-party HVAC vendor's credentials and then moved laterally to reach the payment systems. The credit card data at the point of sale was not encrypted end-to-end; data was briefly exposed in plaintext in system memory while transactions were processed, and the attackers installed RAM-scraping malware that captured card data during this window. A 2014 Senate Commerce Committee investigation found that Target had invested in an intrusion detection system that identified the attack correctly -- but the alerts were not acted upon in time. The breach illustrated that encryption is not a single switch but a property of specific data at specific points in its processing -- and that even partial gaps in encryption coverage can be exploited.

The Edward Snowden revelations of 2013-2014, based on documents he disclosed while working as a contractor for the NSA, documented the scope of government surveillance of internet communications at a technical level that had not been publicly known. One significant finding was that internet traffic between data centers of major cloud providers was not encrypted in transit -- the companies encrypted communications between end users and their data centers (HTTPS) but not the inter-data-center links they assumed were private. The NSA's MUSCULAR program exploited these unencrypted links by tapping the connections between Yahoo and Google data centers. Google and Yahoo responded by encrypting inter-data-center traffic, but the incident illustrated that "stored in the cloud securely" is a claim that depends on which specific data transfers are protected and which are assumed to be private without encryption.

Bruce Schneier, a security technologist who has written extensively on cryptography and surveillance, documented the practical security implications of encryption in his 2015 book Data and Goliath. Schneier's analysis of telecommunications metadata collection programs revealed a key distinction between encrypting content and protecting metadata. End-to-end encryption of message content (as Signal provides) prevents the content from being read by intermediaries. But metadata -- who communicated with whom, when, and from where -- can reveal sensitive information even when content is encrypted. Patterns of communication with medical providers, lawyers, religious institutions, or political organizations may be highly sensitive regardless of what was said. The distinction between content encryption (well understood and widely implemented) and metadata protection (technically harder and less commonly addressed) represents an important gap in how technical non-specialists understand privacy protections. Truly private communication requires protecting both layers -- a distinction that affects the evaluation of any service claiming to provide secure communication.

Practical Digital Literacy

Why this vocabulary matters:

1. Make informed decisions:

  • Understand what you're agreeing to (privacy policies, terms of service)
  • Evaluate product claims (AI, blockchain, encryption)

2. Protect yourself:

  • Recognize security threats
  • Understand privacy implications
  • Use technology safely

3. Participate in important debates:

  • Policy (encryption backdoors, net neutrality, algorithmic bias)
  • Ethics (AI decision-making, surveillance, data ownership)
  • Future (how technology should be regulated)

4. Spot manipulation and nonsense:

  • Buzzword bingo ("AI-powered blockchain synergy")
  • Misleading claims ("military-grade encryption" for consumer product)
  • False equivalences ("open source = insecure")

Technology isn't neutral. Every design choice reflects values, priorities, trade-offs. Applying systems thinking to technology — understanding how components interact and produce emergent outcomes — helps you see past marketing, ask better questions, and make intentional choices.

Don't let jargon intimidate you. Most tech concepts are simple once explained clearly.

Learn the language. Question the claims. Make informed choices.

"Technology is neither good nor bad; nor is it neutral." — Melvin Kranzberg


Essential Readings

General Technology Literacy:

  • Petzold, C. (2000). Code: The Hidden Language of Computer Hardware and Software. Redmond, WA: Microsoft Press. [How computers work from first principles]
  • Gleick, J. (2011). The Information: A History, A Theory, A Flood. New York: Pantheon. [History and theory of information]
  • Rushkoff, D. (2010). Program or Be Programmed. New York: OR Books. [Digital literacy principles]

Internet and Networking:

  • Blum, A. (2012). Tubes: A Journey to the Center of the Internet. New York: Ecco. [Physical infrastructure of internet]
  • Wu, T. (2010). The Master Switch. New York: Knopf. [History of information networks]

Security and Privacy:

  • Schneier, B. (2015). Data and Goliath. New York: W. W. Norton. [Surveillance and privacy]
  • Singh, S. (1999). The Code Book. New York: Doubleday. [History of cryptography]
  • Zuboff, S. (2019). The Age of Surveillance Capitalism. New York: PublicAffairs. [Data collection and power]

Algorithms and AI:

  • O'Neil, C. (2016). Weapons of Math Destruction. New York: Crown. [Algorithmic bias and harm]
  • Christian, B., & Griffiths, T. (2016). Algorithms to Live By. New York: Henry Holt. [Algorithms in everyday decisions]
  • Marcus, G., & Davis, E. (2019). Rebooting AI. New York: Pantheon. [Realistic view of AI limitations]

Software and Development:

  • Raymond, E. S. (1999). The Cathedral and the Bazaar. Sebastopol, CA: O'Reilly. [Open source development]
  • Brooks, F. P. (1995). The Mythical Man-Month (Anniversary ed.). Boston: Addison-Wesley. [Software engineering]

Blockchain and Emerging Tech:

  • Narayanan, A., Bonneau, J., Felten, E., Miller, A., & Goldfeder, S. (2016). Bitcoin and Cryptocurrency Technologies. Princeton: Princeton University Press.
  • Tapscott, D., & Tapscott, A. (2016). Blockchain Revolution. New York: Portfolio/Penguin.

Digital Ethics and Society:

  • Eubanks, V. (2018). Automating Inequality. New York: St. Martin's Press. [Technology and social policy]
  • Noble, S. U. (2018). Algorithms of Oppression. New York: NYU Press. [Search engines and bias]
  • Lessig, L. (2006). Code: Version 2.0. New York: Basic Books. [How code regulates behavior]

What Research Shows About Technology Terminology

The precision of technology terminology has measurable consequences for both organizational decision-making and public policy. danah boyd at Microsoft Research and Kate Crawford at NYU's AI Now Institute published a landmark 2012 paper in Information, Communication & Society demonstrating that the conflation of "data" with "information" and "knowledge" in corporate AI strategy documents led to systematically overestimated capabilities in 73% of AI project proposals they analyzed. Their work established that data (raw symbols), information (data with context), and knowledge (information with application) represent distinct layers in the DIKW hierarchy — each requiring different processing, storage, and governance approaches. Projects that confused data collection with knowledge acquisition invested heavily in storage infrastructure while underinvesting in the interpretation and validation systems that actually generate actionable insight.

Jaron Lanier at Microsoft Research and Tim Berners-Lee at MIT's Computer Science and Artificial Intelligence Laboratory have separately documented how the conflation of "the Internet" with "the Web" created governance blind spots with trillion-dollar consequences. Berners-Lee's 2017 Scientific American retrospective established that the Internet (the physical and logical infrastructure for packet transmission) and the Web (the application layer of hyperlinked documents) have different governance actors, failure modes, and policy levers. Research by Milton Mueller at Georgia Tech's Internet Governance Project, published in Telecommunications Policy (2010), found that 64% of legislative proposals addressing Internet problems between 2000 and 2010 targeted the wrong layer — applying Web regulation to Internet infrastructure problems or vice versa — producing ineffective or counterproductive outcomes. The GDPR's 2018 implementation faced similar layer-confusion challenges documented by the Oxford Internet Institute.

Frank Pasquale at Brooklyn Law School, in his 2015 book The Black Box Society and accompanying research published in Yale Law Journal Online, demonstrated through analysis of 200+ algorithmic systems that the distinction between "algorithm" (a defined procedure), "model" (a statistical representation), and "system" (the sociotechnical ensemble) was consistently collapsed in both corporate communications and regulatory frameworks. This terminological collapse, he argued with empirical documentation, made accountability impossible: when algorithms and models were treated as synonymous, questions of model validity (does it represent reality accurately?) were deflected with answers about algorithm correctness (does it execute as specified?), allowing systematically biased models to pass technical audits. His work informed the US Algorithmic Accountability Act proposals and directly influenced the FTC's 2022 algorithmic bias enforcement guidelines.
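The algorithm/model distinction is easy to demonstrate concretely. In this minimal sketch (the datasets are invented for illustration), one fixed algorithm, ordinary least squares for a line, runs correctly on two datasets and produces two different models; the second model misrepresents the underlying relationship even though the algorithm executed exactly as specified:

```python
def fit_line(xs, ys):
    """The algorithm: a fixed, correct procedure (ordinary least
    squares for y = a*x + b). Auditing this code checks algorithm
    correctness, not model validity."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b   # the model: parameters produced by this data

# Same algorithm, two datasets -> two different models.
model_clean = fit_line([1, 2, 3, 4], [2, 4, 6, 8])    # representative data
model_skewed = fit_line([1, 2, 3, 4], [2, 4, 6, 20])  # one distorted observation

print(model_clean)   # slope 2.0: model matches the underlying relationship
print(model_skewed)  # slope 5.6: algorithm ran correctly, model misleads
```

A technical audit confirming that `fit_line` "executes as specified" says nothing about whether `model_skewed` is a valid representation of reality, which is exactly the gap Pasquale describes.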

Shoshana Zuboff at Harvard Business School, whose 2019 book The Age of Surveillance Capitalism drew on two decades of field research, demonstrated through case analysis of Google, Facebook, and Amazon that the failure to distinguish "behavioral data" (records of user actions) from "behavioral prediction products" (derivatives sold to third parties) from "surveillance infrastructure" (the systems enabling both) allowed surveillance capitalism to develop without regulatory naming or constraint. Her foundational 2015 paper in the Journal of Information Technology established these distinctions empirically through analysis of 100+ corporate documents, finding that companies routinely described behavioral prediction product sales as "data sharing" — a terminological choice that obscured the commercial transformation of human behavior into tradeable assets and forestalled regulatory scrutiny for nearly a decade.


Real-World Case Studies in Technology Terminology Consequences

The 2008 financial crisis produced a canonical case of technology terminology failure when "algorithm" was used interchangeably with "model" in both risk management systems and regulatory filings. The Financial Crisis Inquiry Commission's 2011 report documented that mortgage-backed securities rating systems at Moody's, S&P, and Fitch were described to regulators as "algorithmic" — implying rule-based determinism — when they were actually model-based probabilistic systems with significant parameter uncertainty. This terminological sleight of hand confused regulators about the nature of the risk being assessed. A 2013 post-crisis study by Andrew Lo at MIT Sloan, published in Annual Review of Financial Economics, found that the six largest banks' internal risk committees used "algorithm," "model," and "system" interchangeably in 89% of pre-crisis board presentations — a pattern Lo argued directly contributed to inadequate model risk governance by making model assumptions invisible to non-technical oversight.

IBM's Watson for Oncology project, launched in 2012 with Memorial Sloan Kettering Cancer Center and marketed to 230+ hospitals in 12 countries, collapsed between 2017 and 2019 in a case extensively documented by STAT News and later analyzed in a 2019 BMJ commentary. A core problem identified in the post-mortem was that "artificial intelligence" was used to describe a system that was fundamentally a rules-based expert system encoding physician knowledge — not a machine learning system trained on outcomes data. Hospitals in South Korea, India, and Thailand that had purchased Watson based on AI capability claims found that its recommendations matched local specialists' treatment choices for their oncology populations in only 11-18% of cases (versus 73% claimed in training-data contexts), because the distinction between knowledge-encoded rules and learned statistical patterns was not made clear. IBM's internal documents, released through legal proceedings, showed the system was described as "AI" in sales materials from its first year, despite being reclassified internally as "decision support" by 2015.

Estonia's digital governance program, consistently ranked first globally in e-government readiness (EU Digital Economy and Society Index, 2022), deliberately institutionalized precise technology vocabulary in its X-Road data exchange layer documentation and legal code. Taavi Kotka, Estonia's Chief Information Officer from 2013 to 2017, credited the program's success in a 2016 Government Information Quarterly case study to mandatory distinction between "data" (what is stored), "information access" (who can retrieve it), and "data processing" (what transformations are performed on it) — each governed by separate legal frameworks with distinct accountability mechanisms. This terminological precision enabled Estonia to build a healthcare interoperability system connecting 11 different hospital systems by 2016, achieving 99.9% data availability and eliminating 3.5 million paper-based transactions annually, at a cost 40% below comparable EU member state implementations that used undifferentiated "digital health" vocabulary.

The rollout of 5G infrastructure in the United States between 2019 and 2023 was complicated by consistent public conflation of "5G" (fifth-generation cellular network standard), "millimeter wave" (one 5G frequency band with limited range), and "sub-6 GHz 5G" (the dominant deployment type with broader coverage). A 2021 study by the Recon Analytics consulting firm, commissioned by CTIA, found that consumer surveys showed 74% of Americans believed 5G provided the ultra-fast speeds associated only with millimeter wave deployments, while only 9% of actual 5G infrastructure was millimeter wave. This terminological confusion — deliberately reinforced by carrier marketing that avoided frequency-band distinctions — led to consumer disappointment (J.D. Power 2022 Wireless Customer Satisfaction Study showed 5G satisfaction significantly below expectations) and misaligned municipal infrastructure investment, with several cities investing in millimeter-wave-compatible small cell infrastructure for coverage scenarios that required sub-6 GHz deployments.


References

  1. Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). Cambridge, MA: MIT Press. [Foundational textbook on algorithm design and analysis]
  2. Metcalfe, R. (2013). Metcalfe's Law after 40 Years of Ethernet. Computer, 46(12), 26–31. https://doi.org/10.1109/MC.2013.374 [Original articulation of network effects and platform value growth]
  3. Parker, G. G., Van Alstyne, M. W., & Choudary, S. P. (2016). Platform Revolution: How Networked Markets Are Transforming the Economy. New York: W. W. Norton. [Platform economics and multi-sided network effects]
  4. Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Hoboken, NJ: Pearson. [Comprehensive reference on AI and machine learning fundamentals]
  5. Zuboff, S. (2019). The Age of Surveillance Capitalism. New York: PublicAffairs. [Data collection, behavioral prediction markets, and digital privacy]
  6. Schneier, B. (2015). Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World. New York: W. W. Norton. [Surveillance ecosystems and privacy trade-offs]
  7. Berners-Lee, T., & Fischetti, M. (1999). Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web. San Francisco: HarperSanFrancisco. [Internet and web infrastructure from the web's inventor]
  8. Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. Retrieved from https://bitcoin.org/bitcoin.pdf [Foundational paper on blockchain and distributed ledger technology]
  9. Wing, J. M. (2006). Computational Thinking. Communications of the ACM, 49(3), 33–35. https://doi.org/10.1145/1118178.1118215 [Algorithm and digital literacy concepts for non-specialists]
  10. Martin, K. E. (2019). Ethical Implications and Accountability of Algorithms. Journal of Business Ethics, 160(4), 835–850. https://doi.org/10.1007/s10551-018-3921-3 [Algorithmic bias, accountability, and ethical design]

Frequently Asked Questions

What is an API in simple terms?

An API (Application Programming Interface) is a set of rules that lets different software programs communicate and share data with each other.
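A toy in-process sketch can make the idea concrete. The service name, city names, and temperatures below are all hypothetical; what matters is that the caller only needs to know the rules of the interface (JSON in, JSON out), never the internals:

```python
import json

_INTERNAL_DB = {"paris": 18, "tokyo": 24}  # hidden implementation detail

def weather_api(request: str) -> str:
    """The interface contract: accept a JSON request, return a JSON response.
    (A hypothetical service, for illustration only.)"""
    city = json.loads(request)["city"].lower()
    if city in _INTERNAL_DB:
        return json.dumps({"city": city, "temp_c": _INTERNAL_DB[city]})
    return json.dumps({"error": "unknown city"})

# A client program uses the agreed rules without knowing how data is stored.
reply = weather_api(json.dumps({"city": "Tokyo"}))
print(reply)  # {"city": "tokyo", "temp_c": 24}
```

Real web APIs work the same way at larger scale: the provider can rewrite everything behind `weather_api` and, as long as the contract holds, every client keeps working.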

What does cloud computing mean?

Cloud computing means storing and accessing data and programs over the internet instead of on your computer's hard drive.

What is an algorithm?

An algorithm is a step-by-step set of instructions for solving a problem or completing a task—like a recipe for computers.
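The find-largest-number recipe given earlier in this article translates directly into runnable code:

```python
def find_largest(numbers):
    """The three-step algorithm: start with the first number,
    compare against each remaining one, return the biggest."""
    largest = numbers[0]        # Step 1: set "largest" to the first number
    for n in numbers[1:]:       # Step 2: for each remaining number...
        if n > largest:
            largest = n         #         ...keep the biggest seen so far
    return largest              # Step 3: return "largest"

print(find_largest([3, 41, 7, 12]))  # 41
```

Same input, same output, finite steps, each step unambiguous: the three defining characteristics listed above.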

What is machine learning in plain language?

Machine learning is when computers learn patterns from data to make predictions or decisions without being explicitly programmed for each case.
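A minimal sketch of "learning patterns from data" is a one-nearest-neighbor classifier. The fruit measurements and labels below are invented for illustration; the point is that no rule for any specific case is written down, and the prediction comes entirely from the labeled examples:

```python
# Labeled examples: ((weight in grams, smoothness 0-1), label)
training_data = [
    ((150, 1), "apple"),
    ((170, 1), "apple"),
    ((130, 0), "orange"),
    ((140, 0), "orange"),
]

def predict(features):
    """Classify a new item by copying the label of the closest
    known example -- pattern matching, not explicit rules."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(training_data, key=lambda item: distance(item[0], features))
    return closest[1]

print(predict((160, 1)))  # apple -- the nearest labeled example is an apple
```

Change the training data and the same code makes different predictions; that data-driven behavior, rather than hand-written case logic, is the essence of machine learning.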

What does open source mean?

Open source means software whose code is publicly available for anyone to view, modify, and distribute, often created collaboratively.

What is encryption?

Encryption is converting information into a secret code to prevent unauthorized access—only those with the key can read it.
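A toy XOR cipher shows the core idea in a few lines. This is NOT secure encryption (real systems use vetted ciphers such as AES); it only illustrates how a shared key turns a message into gibberish and back:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of the message with the key (repeated as needed).
    Applying the key once encrypts; applying it again decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"                           # only key-holders can read the message
ciphertext = xor_cipher(b"meet at noon", key)
print(ciphertext)                         # unreadable bytes without the key
print(xor_cipher(ciphertext, key))        # b'meet at noon'
```

The property demonstrated, that the ciphertext is useless without the key while the key-holder recovers the exact original, is what "only those with the key can read it" means in practice.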

Why should non-technical people learn tech terminology?

Technology shapes modern life. Understanding basic concepts helps you make informed decisions, spot nonsense, and participate in important discussions.