History of Hashtags: Roman to AI

Close up of ancient Roman stone inscription with carved letters showing scribal abbreviations — Roman scribal abbreviations, including ligature for libra pondo, laid the groundwork for what became the # symbol nearly two millennia later.

On August 23, 2007, product consultant Chris Messina typed a single message on Twitter: “How do you feel about using # (pound) for groups. As in #barcamp [msg]?” That short question changed how people label, search, and organize public conversation online. By 2026, social platforms collectively host over 500 million hashtags each month, according to the WorldMetrics 2026 report on hashtag statistics. The symbol Messina borrowed from Internet Relay Chat had already survived a long technical journey, from Roman scribes abbreviating weights to typewriter keyboards, telephone keypads, ASCII, social media, and machine analysis of cultural patterns.

Roman Origins: From Libra Pondo to Typographic Ligature

The # symbol traces its lineage to the Roman term libra pondo, meaning “pound weight.” The abbreviation lb was often written as a joined mark rather than two isolated letters. Over time, scribes used the ligature ℔, combining L and b with a horizontal stroke across the top. That stroke told the reader the written form was an abbreviation and should be expanded mentally.

The Wikipedia entry on the number sign describes the long shift from the ℔ ligature toward the modern cross-hatched form. The line across the abbreviation became visually entangled with the slanted strokes beneath it. In handwriting, repeated use simplified the mark. The result was a sign that no longer looked like the letters L and b, even though its origin remained tied to pound weight.

By the time Joseph Moxon published Mechanick Exercises in 1683, the # shaped mark was already used in printing practice as a correction mark. Printers placed it in margins to signal that space should be inserted between words. That use moved the symbol from private shorthand into a documented typographic workflow.

Close up of vintage typewriter keyboard showing number sign key — Mechanical keyboards gave the # symbol a fixed place in everyday text entry long before it became a social media tag.

Mechanical Adoption: Typewriters, Telephones, and ASCII

The move from handwriting to machines changed the symbol’s future. Once a mark earns a place on a keyboard, it becomes cheap to reproduce. The Remington Standard typewriter around 1886 helped bring the sign into mechanical use. The Blickensderfer model 5 typewriter, circa 1896, referred to it as “number mark” in its instruction manual.

Early twentieth-century American sources called it “number sign.” A 1903 shorthand textbook used “pound or number sign” and explained its positional meanings. The phrase “pound sign” entered U.S. usage by 1932, based on historical dictionary citations summarized in the number sign reference. The most important mechanical adoption came through teleprinters and character sets. When the symbol entered early teleprinter codes and then ASCII, it became available across computer systems instead of remaining a local office convention.

Bell Telephone Laboratories placed the sign on the bottom-right button of touch-tone telephone keypads in 1968. The key became more familiar in the early 1980s as voicemail and PBX systems used it as a command key. The name “octothorp” also emerged at Bell Labs, where engineers needed a formal term for documentation. One account links “octo” to the symbol’s eight free stroke ends and “thorp” to athlete Jim Thorpe. The first appearance of “octothorp” in a U.S. patent came in 1973.

Stage	Approximate date	Main function	Why it mattered	Source
Libra pondo ligature	Roman and later manuscript use	Abbreviation for pound weight	Created visual ancestor of later # sign	Number sign reference
Moxon’s printing mark	1683	Printer correction mark for spacing	Moved form into documented print practice	Number sign reference
Remington Standard typewriter	Around 1886	Mechanical keyboard character	Made mark easy to type in office work	Number sign reference
Bell touch-tone keypad	1968	Telephone command key	Put sign in front of millions of non-technical users	Number sign reference
IRC channels	Around 1988	Network-wide channel prefix	Created direct convention later reused by Twitter users	Hashtag reference
Twitter hashtag proposal	August 23, 2007	Public topic grouping	Turned mark into user-created metadata layer	Hashtag reference

Digital Tagging: IRC Channels and the Birth of the Hashtag

Before social platforms turned # into a cultural signal, computing had already assigned it several technical jobs. The C programming language used # for preprocessor directives. PDP-11 assembly language used it to denote immediate addressing mode. Those uses did not create the social tag, but they kept the symbol visible to programmers and early network users.

The direct ancestor of the social hashtag was Internet Relay Chat. Around 1988, IRC used # to prefix channels and topics available across the network. Local channels used ampersand (&) instead. The convention was simple: put a visible marker before a shared topic, and people can gather around it without waiting for a central editor to create a category.
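The prefix convention described above can be sketched in a few lines. This is a minimal illustration, not an IRC client: the function name `classify_irc_channel` and the labels it returns are hypothetical, and any prefix other than # and & is simply reported as unknown.

```python
def classify_irc_channel(name: str) -> str:
    """Classify a channel name by its leading prefix character.

    Only the two prefixes discussed in the article are handled; the
    returned labels are illustrative names, not protocol terms.
    """
    if name.startswith("#"):
        return "network"  # visible across the whole IRC network
    if name.startswith("&"):
        return "local"    # local to a single server
    return "unknown"


if __name__ == "__main__":
    for channel in ("#barcamp", "&ops", "barcamp"):
        print(channel, "->", classify_irc_channel(channel))
```

The point of the sketch is how little machinery the convention needs: one leading character is enough to tell software and readers how widely a topic is shared.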

Chris Messina was an open-source advocate and IRC user. On August 23, 2007, he suggested using # on Twitter for groups. Twitter did not immediately build the feature as a formal product. Users adopted it first. The hashtag reference notes that Messina did not try to patent the idea and described hashtags as being “born of the internet, and owned by no one.”

Stowe Boyd used the term “hash tag” in a blog post on August 26, 2007, three days after Messina’s original message. The convention spread during the October 2007 San Diego forest fires, when users tagged posts with #sandiegofire to coordinate information. The tag format gained wider international attention during the 2009-2010 Iranian election protests, where English and Persian tags helped users share information across borders.

Twitter began hyperlinking hashtags to search results on July 2, 2009. In 2010, the platform introduced Trending Topics, which surfaced tags gaining rapid attention. By June 2014, the word “hashtag” had entered the Oxford English Dictionary.

Person holding smartphone with social media feed showing trending hashtags — Hashtags turned the # symbol into a user-controlled routing layer for public conversation.

Global Names: Pound, Hash, Octothorp, Sharp, and Square

The Unicode Consortium designates U+0023 as “number sign,” but daily usage depends on region, industry, and device context. In North America, many people still call it the pound sign because of telephone prompts and weight notation. In the UK, Australia, and South Africa, “hash” is more common. In social media contexts, many users call the symbol itself a hashtag, even though hashtag strictly means the symbol plus the tag string.

Name	Common region or context	Typical use	Source
Number sign	Canada and northeastern United States	Official Unicode name and numeric labeling	Number sign reference
Pound sign	United States and Canada	Telephone keypad prompts and weight notation	Number sign reference
Hash	United Kingdom, Australia, and South Africa	General speech and programming compounds such as hash bang	Number sign reference
Hashtag	Social media	Metadata tag prefix and searchable public topic marker	Hashtag reference
Hex	Singapore and Malaysia	Telephone menus and apartment addressing	Number sign reference
Octothorp	Bell Labs and technical documentation	Formal engineering name for keypad symbol	Number sign reference
Sharp	Music and programming	Resemblance to musical sharp sign and pronunciation of C#	Number sign reference
Square	ITU-T E.161 telephone keypad terminology	Formal keypad naming in standard	Number sign reference

“Sharp” comes from resemblance, not identity. The musical sharp sign is U+266F, while # is U+0023. The two signs are visually close enough that C# is pronounced “C Sharp,” but the ECMA-334 specification writes the language name using the number sign after the letter C. That distinction matters in fonts, identifiers, search, and accessibility tooling.
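The code point distinction can be checked with Python’s standard unicodedata module. The sketch below is illustrative: the helper names `describe` and `folds_to_number_sign` are hypothetical, and NFKC is used only as one common example of the normalization a search or tagging pipeline might apply.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str]:
    """Return the code point label and official Unicode name of a character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))


def folds_to_number_sign(ch: str) -> bool:
    """Report whether NFKC normalization maps the character to '#'."""
    return unicodedata.normalize("NFKC", ch) == "#"


if __name__ == "__main__":
    # Number sign, musical sharp, the lb ligature, and the fullwidth number sign.
    for ch in ("\u0023", "\u266F", "\u2114", "\uFF03"):
        code, name = describe(ch)
        print(code, name, "folds to #:", folds_to_number_sign(ch))
```

The fullwidth number sign folds to # under NFKC, but the musical sharp sign does not, so a search for “C#” will not match text written with U+266F unless the tooling maps it explicitly.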

AI Analysis: How Hashtag Networks Become Cultural Data

By 2026, the # symbol has moved far beyond weight notation and social categorization. According to the WorldMetrics 2026 report, social media platforms collectively host over 500 million hashtags each month. The same report states that Instagram posts with hashtags receive 2.3 times more likes than posts without them, TikTok posts with 10 to 15 hashtags have a three times higher chance of trending, and hashtag campaigns deliver 19% higher return on investment than campaigns without them. These numbers should be treated as platform-dependent operating signals, not universal laws.

The deeper change in 2026 is analytical. Researchers now treat hashtags as observable traces of public attention. A tag can be a topic label, campaign slogan, joke, product category, location marker, or group identity signal. When tags appear together repeatedly, they form networks. Those networks can be measured over time to see which associations remain stable and which ones appear briefly during news cycles.

Complexity Digest reported on April 29, 2026 that researchers studying temporal hashtag networks used ensemble clustering to distinguish stable cultural modules from transient ones in social photo-sharing data. The article, “Exploring Cultural Evolution Through Modular Dynamics in Temporal Hashtag Networks”, describes work using four years of data from major photo-sharing platforms. The key idea is practical: when the same clusters keep reappearing across time windows and perturbed network samples, they are less likely to be statistical noise.
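The reappearance idea can be illustrated with a much simpler measure than the ensemble clustering the researchers used. The sketch below is not their method: it assumes clusters have already been computed for each time window, matches them by Jaccard similarity, and reports how often a cluster finds a close match in later windows. The function names, the 0.5 threshold, and the sample tags are all assumptions made for illustration.

```python
def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(cluster: frozenset, candidates: list) -> float:
    """Highest Jaccard similarity between a cluster and any candidate cluster."""
    return max((jaccard(cluster, c) for c in candidates), default=0.0)


def persistence(cluster: frozenset, later_windows: list, threshold: float = 0.5) -> float:
    """Fraction of later windows holding a cluster at least `threshold` similar."""
    if not later_windows:
        return 0.0
    hits = sum(
        1 for clusters in later_windows
        if best_match(cluster, clusters) >= threshold
    )
    return hits / len(later_windows)


# Hypothetical clusters for three consecutive time windows.
windows = [
    [frozenset({"climatecrisis", "climateaction", "cop"}),
     frozenset({"launchday", "promo"})],
    [frozenset({"climatecrisis", "climateaction", "netzero"})],
    [frozenset({"climatecrisis", "climateaction", "cop"}),
     frozenset({"election"})],
]

if __name__ == "__main__":
    for cluster in windows[0]:
        print(sorted(cluster), persistence(cluster, windows[1:]))
```

In this toy data the climate cluster finds a match in both later windows while the launch cluster finds none. A fuller analysis would also repeat the clustering on perturbed copies of each network, as the article describes.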

For engineers, the intuition is similar to monitoring service dependencies. A one-time spike in traffic between two services may be an incident, test, or bot. A repeated pattern across weeks is more likely to describe a real dependency. Hashtag analysis applies the same logic to culture. A stable cluster around #MeToo, #BlackLivesMatter, or #ClimateCrisis means the tag is functioning as more than a search term. It is helping route attention, identity, and action.

The WorldMetrics report states that #MeToo generated over 17 million tweets across 85 countries, #BlackLivesMatter has been credited with influencing policy changes in 12 countries, and #ClimateCrisis appears in 70% of global climate advocacy content. Those examples show why hashtag networks matter to researchers and communications teams. Large tags can become durable anchors. Short-lived tags can still matter during emergencies, protests, product launches, or misinformation events.

Machine analysis helps because humans cannot manually inspect hundreds of millions of monthly tags at useful speed. The trade-off is context. A model can detect co-occurrence and cluster stability, but it can miss irony, reclaimed language, coordinated manipulation, or local meaning. A cultural analyst still needs sample-level review, language knowledge, and platform context before turning a graph into a decision.
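Sample-level review can be built into the workflow by drawing a small, reproducible sample of posts from each cluster for a human to read. The sketch below is one possible approach under stated assumptions: the function name `sample_for_review`, the cluster labels, and the post identifiers are hypothetical, and the mapping from cluster to posts is assumed to exist already.

```python
import random


def sample_for_review(posts_by_cluster: dict, per_cluster: int = 5, seed: int = 0) -> dict:
    """Pick up to `per_cluster` posts from each cluster for manual review.

    A fixed seed keeps the sample reproducible, so two reviewers can be
    handed the same posts.
    """
    rng = random.Random(seed)
    samples = {}
    for cluster_id, posts in posts_by_cluster.items():
        k = min(per_cluster, len(posts))
        samples[cluster_id] = rng.sample(list(posts), k)
    return samples


if __name__ == "__main__":
    # Hypothetical post identifiers grouped by cluster label.
    posts_by_cluster = {
        "climate": [f"post-{i}" for i in range(20)],
        "launch": ["post-a", "post-b"],
    }
    for cluster_id, picked in sample_for_review(posts_by_cluster, per_cluster=3).items():
        print(cluster_id, picked)
```

Random sampling will not surface every case of irony or manipulation, but it gives the analyst a concrete set of posts to check before acting on a graph.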

Production Code: Building a Hashtag Co-occurrence Graph

A production team analyzing hashtags usually starts with a simple pipeline: collect public posts under platform rules, normalize tags, build time-windowed co-occurrence edges, and export the graph for clustering. The example below uses only Python’s standard library so it can run in a locked-down environment without adding dependencies. It reads a CSV of posts, extracts hashtags, creates weighted edges between tags that appear in the same post, and writes an edge list suitable for downstream graph analysis.

Note: The following code is an illustrative example and has not been verified against official documentation. Please refer to the official docs for production-ready code.

#!/usr/bin/env python3
"""
Build hashtag co-occurrence edge list from exported social posts.

Input CSV columns expected:
    post_id, created_at, platform, text

Output CSV columns:
    window_start, tag_a, tag_b, weight

Production notes:
    - Add platform-specific API compliance checks before collection.
    - Add language detection if you compare tags across regions.
    - Add bot and spam filtering before graph construction.
    - Add retention limits for personal data and raw post text.
"""

import csv
import re
import sys
from collections import Counter
from datetime import datetime, timedelta, timezone
from itertools import combinations
from pathlib import Path

# Match "#tag" unless the "#" follows a word character or another "#".
# This skips URL fragments such as "page#section" and runs like "##tag".
HASHTAG_RE = re.compile(r"(?<![\w#])#(\w+)")

# Fixed reference point used to align time windows.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def normalize_tag(tag: str) -> str:
    """Lowercase a tag and strip any stray leading or trailing '#'."""
    return tag.lower().strip("#")


def parse_iso_timestamp(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp string into a timezone-aware UTC datetime."""
    ts = ts.strip()
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def floor_to_window(ts: datetime, window: timedelta) -> datetime:
    """Round a UTC datetime down to the start of its fixed-size window."""
    return ts - (ts - EPOCH) % window


def build_cooccurrence_graph(
    csv_path: Path,
    window_hours: int = 24,
) -> list[tuple[str, str, str, int]]:
    """
    Build co-occurrence edges from CSV of posts.

    Returns list of (window_start, tag_a, tag_b, weight) tuples.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")

    # One counter keyed by (window_start, tag_a, tag_b).
    edge_weights: Counter = Counter()
    window = timedelta(hours=window_hours)

    with open(csv_path, "r", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            try:
                post_ts = parse_iso_timestamp(row["created_at"] or "")
                post_text = row["text"] or ""
            except (KeyError, ValueError) as exc:
                print(f"Skipping row: {exc}", file=sys.stderr)
                continue

            raw_tags = HASHTAG_RE.findall(post_text)
            # Deduplicate and sort so each pair has one canonical order.
            tags = sorted(set(normalize_tag(t) for t in raw_tags if t))

            if len(tags) < 2:
                continue

            window_start = floor_to_window(post_ts, window).isoformat()

            for tag_a, tag_b in combinations(tags, 2):
                edge_weights[(window_start, tag_a, tag_b)] += 1

    results = [
        (window_start, tag_a, tag_b, weight)
        for (window_start, tag_a, tag_b), weight in edge_weights.items()
    ]

    # Sort by window, then ascending weight, then tag names.
    results.sort(key=lambda x: (x[0], x[3], x[1], x[2]))
    return results


def write_edge_list(
    edges: list[tuple[str, str, str, int]],
    output_path: Path,
) -> None:
    """Write co-occurrence edges to CSV."""
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["window_start", "tag_a", "tag_b", "weight"])
        writer.writerows(edges)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python hashtag_graph.py <input_csv> [output_csv]")
        sys.exit(1)

    input_csv = Path(sys.argv[1])
    output_csv = Path(sys.argv[2]) if len(sys.argv) > 2 else Path("cooccurrence.csv")

    edges = build_cooccurrence_graph(input_csv)
    write_edge_list(edges, output_csv)
    print(f"Wrote {len(edges)} edges to {output_csv}")

For production use, the pipeline needs platform-specific API compliance checks before collection. Language detection helps when comparing tags across regions. Bot and spam filtering should run before graph construction. Retention limits for personal data and raw post text must be set.

A practical next step after building the edge list is to use a graph library such as NetworkX for clustering. The build_cooccurrence_graph function returns edges grouped by time window, which allows for temporal analysis. Managers can run this weekly and compare cluster stability across windows to decide which tags represent durable themes versus passing trends.
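Before reaching for a clustering library, a quick dependency-free check is to measure how persistently each tag pair appears across windows. The sketch below reads rows in the edge-list format described earlier (window_start, tag_a, tag_b, weight) and reports, for each pair, the fraction of windows in which it occurs. The function name `edge_persistence` and the sample rows are hypothetical, and only windows that contain at least one edge are counted.

```python
import csv
import io
from collections import defaultdict


def edge_persistence(rows) -> dict:
    """Map each (tag_a, tag_b) pair to the fraction of windows it appears in."""
    windows = set()
    seen = defaultdict(set)
    for row in rows:
        windows.add(row["window_start"])
        seen[(row["tag_a"], row["tag_b"])].add(row["window_start"])
    if not windows:
        return {}
    return {pair: len(ws) / len(windows) for pair, ws in seen.items()}


# Made-up edge list using weekly windows, in the same column layout as the output CSV.
SAMPLE = """window_start,tag_a,tag_b,weight
2026-01-05T00:00:00+00:00,climateaction,climatecrisis,12
2026-01-05T00:00:00+00:00,launchday,promo,40
2026-01-12T00:00:00+00:00,climateaction,climatecrisis,9
2026-01-19T00:00:00+00:00,climateaction,climatecrisis,15
"""

scores = edge_persistence(csv.DictReader(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(pair, round(score, 2))
```

In the sample, the launch pair has the highest single-window weight but appears in only one of three windows, while the climate pair appears in all three. Weight alone would rank them the other way, which is why comparing across windows matters.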

Key Takeaways

The # symbol originated from the Roman libra pondo abbreviation for pound weight and evolved through the ℔ ligature into the modern cross-hatched form.
Mechanical adoption on typewriters, telephone keypads, teleprinter codes, and ASCII made the symbol easy to reproduce before social media existed.
IRC channel naming directly influenced Chris Messina’s August 23, 2007 proposal to use # for groups on Twitter.
By 2026, hashtag use is large enough for network analysis, with WorldMetrics reporting over 500 million hashtags across social platforms each month.
AI-assisted clustering can identify durable cultural modules in hashtag networks, but teams still need human review for irony, manipulation, local meaning, and platform-specific context.

Sources: Wikipedia: Number Sign, Wikipedia: Hashtag, WorldMetrics Hashtag Statistics 2026, Complexity Digest: Cultural Evolution in Hashtag Networks.
