Every physicist drowns in literature. More than two million scientific papers are published annually, far more than any researcher can read, let alone synthesise. Large language models have fundamentally changed how researchers interact with this firehose of knowledge. This cluster covers every practical NLP tool in the physicist's arsenal: semantic search, automated summarisation, RAG pipelines, equation extraction, fine-tuning on domain text, and LLM-assisted code generation, with working Python code throughout.
AI for Physics Students › Cluster 9: NLP & LLMs for Physics
Clusters 1–8 focused on ML tools that help physicists do physics — fitting data, solving equations, simulating systems, discovering laws. Cluster 9 focuses on something different: using AI to help you read, understand, and build on the physics that has already been done. The literature is the accumulated knowledge of the field. LLMs are becoming the interface to that knowledge.
- How Transformers Work (Physics Intuition)
- Semantic Search of arXiv with SPECTER
- Building a RAG Pipeline for Literature Review
- Automated Paper Summarisation
- LaTeX Equation Extraction & Classification
- Fine-Tuning LLMs on Physics Text
- LLM-Assisted Physics Coding
- Responsible Use: Hallucination & Trust
Section 1 — How Transformers Work: A Physicist’s Intuition
Before using LLMs as tools, it helps to understand what they are doing — at least at the level of physical intuition. The transformer architecture, introduced by Vaswani et al. (2017), processes sequences by computing attention: a learned measure of relevance between every pair of positions in the input.
Think of it this way. Each token (a word, subword, or symbol) has three representations: a Query (what this token is looking for), a Key (what this token advertises about itself), and a Value (what this token contributes if attended to). The attention score between two tokens is the dot product Q·K, divided by √d_k so that scores do not grow with embedding dimension and saturate the softmax, which would produce vanishing gradients. The output is a weighted sum of Values, where the weights are the softmax-normalised attention scores.
For a physicist, this is a non-local, learned Green’s function. In a classical field theory, G(x, x′) couples field values at different spacetime points. Attention couples token representations at different sequence positions. The “field” is the sequence of token embeddings; the “coupling” is learned from data rather than derived from physics. The parallel is remarkably close — and it explains why transformers can handle long-range dependencies that RNNs cannot.
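As a minimal illustration, the whole attention computation fits in a few lines of NumPy. This is a sketch: a single attention head with random (rather than learned) projection weights, just to make the Q·K/√d_k mechanics concrete.

```python
import numpy as np

def attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention over token embeddings X [T, d]."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # queries, keys, values [T, d_k]
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise relevance [T, T]
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # softmax over keys: rows sum to 1
    return w @ V                                  # weighted sum of values [T, d_k]

rng = np.random.default_rng(0)
T, d, d_k = 5, 8, 4                               # toy sequence of 5 tokens
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d_k)) for _ in range(3))
out = attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Every output row mixes information from every input position, weighted by learned relevance — the non-locality the Green's function analogy points at.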
Section 2 — Semantic Search of arXiv with Physics-Specific Embeddings
Standard keyword search of arXiv finds papers that contain your exact keywords. Semantic search finds papers that mean what you mean, even if they use different terminology. A query for “neural network potential energy surface” should also return papers about “machine learning interatomic potentials” and “deep learning force fields” — because they describe the same concept using different vocabulary. Keyword search fails here; semantic search succeeds.
The tool is sentence embeddings: a model that maps text to a dense vector in a semantic space where similar meanings cluster together. For physics, a state-of-the-art choice is SPECTER (Cohan et al. 2020), trained on citation graphs of scientific papers — if paper A cites paper B, their embeddings should be similar. SPECTER understands domain vocabulary: "Hamiltonian" and "energy operator" will have similar vectors.
```python
# pip install sentence-transformers arxiv faiss-cpu
import arxiv
from sentence_transformers import SentenceTransformer
import numpy as np
import faiss

# ── Step 1: Download physics papers from arXiv ─────────────────
client = arxiv.Client()
search = arxiv.Search(
    query       = 'cat:cond-mat OR cat:hep-ph OR cat:astro-ph',
    max_results = 5000,
    sort_by     = arxiv.SortCriterion.SubmittedDate
)
papers = []
for result in client.results(search):
    papers.append({
        'id':       result.entry_id,
        'title':    result.title,
        'abstract': result.summary,
        'authors':  [a.name for a in result.authors[:3]],
        'date':     str(result.published.date()),
        'url':      result.pdf_url
    })
print(f"Downloaded {len(papers)} papers")

# ── Step 2: Embed abstracts with SPECTER (physics-trained model) ─
# allenai/specter2_base is the SPECTER2 checkpoint loadable directly with
# sentence-transformers (allenai/specter2 itself is an adapter on top of it)
model = SentenceTransformer('allenai/specter2_base')
texts = [f"{p['title']} {p['abstract']}" for p in papers]
embeddings = model.encode(
    texts,
    batch_size           = 64,
    show_progress_bar    = True,
    normalize_embeddings = True   # unit vectors for cosine similarity
)
print(f"Embeddings shape: {embeddings.shape}")  # [5000, 768]

# ── Step 3: Build FAISS index for fast nearest-neighbour search ─
# FAISS: Facebook AI Similarity Search — handles millions of vectors
d = embeddings.shape[1]        # embedding dimension: 768
index = faiss.IndexFlatIP(d)   # inner product (= cosine for unit vectors)
index.add(embeddings.astype(np.float32))
print(f"FAISS index: {index.ntotal} vectors")

# ── Step 4: Semantic search ─────────────────────────────────────
def semantic_search(query, top_k=10):
    q_emb = model.encode([query], normalize_embeddings=True)
    scores, indices = index.search(q_emb.astype(np.float32), top_k)
    results = []
    for score, idx in zip(scores[0], indices[0]):
        p = papers[idx]
        results.append({**p, 'similarity': float(score)})
    return results

# ── Example queries ─────────────────────────────────────────────
queries = [
    'machine learning interatomic potentials molecular dynamics',
    'transformer architecture attention quantum many-body systems',
    'normalizing flows posterior sampling particle physics',
]
for q in queries:
    results = semantic_search(q, top_k=5)
    print(f"\nQuery: {q}")
    for r in results:
        print(f"  [{r['similarity']:.3f}] {r['title'][:70]}")
```
💡 Model Choices for Physics Embeddings
SPECTER2 (allenai/specter2) is the best general-purpose scientific paper embedding model — trained on 75M citation pairs from Semantic Scholar. For pure physics text, SciBERT (allenai/scibert_scivocab_uncased) is a BERT model pre-trained on 1.14M scientific papers. For cross-modal tasks involving equations, consider MathBERT or the recently released LLEMMA (math-focused LLM). Start with SPECTER2 for literature search.
Section 3 — Building a RAG Pipeline for Physics Literature Review
Retrieval-Augmented Generation (RAG) is the most practical and reliable way to use LLMs for scientific literature review. The idea: instead of asking an LLM to answer from its training data (which may be outdated or hallucinated), you first retrieve relevant papers from your own database, then include them as context in the LLM prompt. The model answers based on retrieved documents, not from memory.
For physics research, this is transformative. You can build a private RAG system over your group’s preprints, your institution’s published work, a curated reading list, or the full arXiv corpus in your subfield. When you ask “what is the current experimental status of the muon g-2 anomaly?”, the system retrieves the five most relevant recent papers and synthesises them into a structured answer — with citations you can verify.
```python
# pip install langchain langchain-community langchain-openai chromadb
# RAG pipeline: retrieve relevant chunks, then generate with LLM
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# ── Step 1: Collect and chunk physics papers ───────────────────
splitter = RecursiveCharacterTextSplitter(
    chunk_size    = 800,   # characters per chunk
    chunk_overlap = 100,   # overlap to preserve context across chunks
    separators    = ['\n\n', '\n', '. ', ' ']  # respect paragraph structure
)
documents = []
for paper in papers[:200]:   # the papers downloaded in Section 2
    text = f"{paper['title']}\n\n{paper['abstract']}"
    chunks = splitter.create_documents(
        [text],
        metadatas=[{'title': paper['title'], 'url': paper['url'], 'date': paper['date']}]
    )
    documents.extend(chunks)
print(f"Total chunks: {len(documents)}")

# ── Step 2: Embed and store in vector database (Chroma) ────────
# Chroma: lightweight, local vector DB — no server needed
embedding_model = HuggingFaceEmbeddings(
    model_name    = 'allenai/specter2_base',  # SPECTER2 base checkpoint
    model_kwargs  = {'device': 'cpu'},
    encode_kwargs = {'normalize_embeddings': True}
)
vectorstore = Chroma.from_documents(
    documents, embedding_model, persist_directory='./physics_rag_db'
)
retriever = vectorstore.as_retriever(search_kwargs={'k': 5})

# ── Step 3: RAG chain with physics-tuned prompt ────────────────
PHYSICS_RAG_PROMPT = """You are a physics research assistant with deep domain expertise.
Answer the question using ONLY the provided context from scientific papers.
Always cite the paper titles when making specific claims.
If the context does not contain enough information, say so explicitly.
Do NOT speculate beyond what the papers say.

Context from retrieved papers:
{context}

Question: {question}

Answer (cite paper titles inline):"""
prompt = ChatPromptTemplate.from_template(PHYSICS_RAG_PROMPT)

# ── Step 4: Chain retrieval + generation ───────────────────────
# Using OpenAI API — replace with Anthropic/local model as preferred
llm = ChatOpenAI(model='gpt-4o-mini', temperature=0.1)  # low temp for factual answers

def rag_answer(question):
    docs = retriever.invoke(question)
    context = '\n\n---\n\n'.join([d.page_content for d in docs])
    sources = list(set([d.metadata['title'] for d in docs]))
    chain = prompt | llm | StrOutputParser()
    answer = chain.invoke({'context': context, 'question': question})
    return answer, sources

# Example questions a physicist might ask:
questions = [
    'What ML methods are used for jet tagging at the LHC?',
    'How do neural network potentials compare to DFT in accuracy?',
    'What is the current state of gravitational wave detection with ML?',
]
for q in questions:
    answer, sources = rag_answer(q)
    print(f'Q: {q}')
    print(f'A: {answer[:300]}...')
    print(f'Sources: {sources}\n')
```
Section 4 — Automated Paper Summarisation at Scale
Reading a paper fully takes 30–90 minutes. Skimming the abstract, introduction, and conclusions takes 5–10 minutes. An LLM can produce a structured summary in 10 seconds. For a physicist keeping up with a fast-moving subfield, this is a genuine productivity multiplier — provided you trust the summary enough to decide whether the full paper is worth reading, and you read the full paper before citing anything.
```python
# pip install requests beautifulsoup4 anthropic
# Automated paper summarisation pipeline
# Works with arXiv papers via their HTML rendering, or any text
import requests
from bs4 import BeautifulSoup
from anthropic import Anthropic

# ── Fetch full paper text from arXiv HTML endpoint ─────────────
def fetch_arxiv_text(arxiv_id):
    """Fetch plain text of an arXiv paper via the ar5iv HTML endpoint."""
    url = f'https://ar5iv.labs.arxiv.org/html/{arxiv_id}'
    resp = requests.get(url, timeout=30)
    if resp.status_code != 200:
        return None
    soup = BeautifulSoup(resp.text, 'html.parser')
    article = soup.find('article')   # main article text, skip page chrome
    if not article:
        return None
    # Remove reference section
    for refs in article.find_all(class_=['ltx_bibliography']):
        refs.decompose()
    return article.get_text(separator=' ', strip=True)[:12000]  # ~3k tokens

# ── Structured physics summary prompt ──────────────────────────
SUMMARY_PROMPT = '''You are a senior physicist. Summarise this paper concisely.
Structure your summary exactly as follows:

**Problem**: What specific problem does this paper address?
**Method**: What ML/computational approach do they use? (2-3 sentences)
**Key result**: What is the most important quantitative finding?
**Significance**: Why does this matter for the field?
**Limitations**: What does the paper not address or where might it fail?
**Recommended for**: What type of physicist should read this in full?

Paper text:
{text}

Summary:'''

# ── Summarise a paper ──────────────────────────────────────────
client = Anthropic()

def summarise_paper(arxiv_id):
    text = fetch_arxiv_text(arxiv_id)
    if not text:
        return 'Could not fetch paper text.'
    response = client.messages.create(
        model      = 'claude-sonnet-4-20250514',
        max_tokens = 1000,
        messages   = [{'role': 'user', 'content': SUMMARY_PROMPT.format(text=text)}]
    )
    return response.content[0].text

# ── Batch summarise a reading list ─────────────────────────────
reading_list = [
    '2310.06825',  # GNoME materials paper
    '2112.09071',  # autoencoder anomaly detection HEP
    '2203.07404',  # causal PINNs
]
for arxiv_id in reading_list:
    print('\n' + '=' * 60 + f'\narXiv:{arxiv_id}')
    print(summarise_paper(arxiv_id))
```
Section 5 — LaTeX Equation Extraction and Classification
Physics papers are unique in scientific literature: they are dense with mathematical expressions encoded as LaTeX. For many tasks — building equation databases, automatic knowledge graphs, training domain-specific models — you need to extract, parse, and classify equations from papers. This is a non-trivial NLP problem because equations are interspersed with natural language, span multiple lines, and can be nested arbitrarily.
```python
# pip install requests anthropic
# Extract and classify equations from LaTeX source files
# arXiv provides source .tar.gz files for most papers
import re, tarfile, io, requests

# ── Download LaTeX source from arXiv ───────────────────────────
def get_arxiv_latex(arxiv_id):
    url = f'https://arxiv.org/src/{arxiv_id}'
    resp = requests.get(url, timeout=30)
    try:
        with tarfile.open(fileobj=io.BytesIO(resp.content)) as tar:
            for member in tar.getmembers():
                if member.name.endswith('.tex'):
                    f = tar.extractfile(member)
                    if f:
                        return f.read().decode('utf-8', errors='ignore')
    except tarfile.TarError:   # some papers ship a single file, not a tarball
        pass
    return None

# ── Extract display equations ($$...$$, \[...\], equation env) ─
def extract_equations(latex_text):
    patterns = [
        r'\\begin\{equation\*?\}(.*?)\\end\{equation\*?\}',
        r'\\begin\{align\*?\}(.*?)\\end\{align\*?\}',
        r'\\begin\{eqnarray\*?\}(.*?)\\end\{eqnarray\*?\}',
        r'\$\$(.+?)\$\$',
        r'\\\[(.*?)\\\]',
    ]
    equations = []
    for pattern in patterns:
        matches = re.findall(pattern, latex_text, re.DOTALL)
        equations.extend([m.strip() for m in matches if len(m.strip()) > 5])
    return list(set(equations))   # deduplicate

# ── Classify equations by type using LLM ───────────────────────
from anthropic import Anthropic
client = Anthropic()

def classify_equation(eq_latex):
    prompt = f"""Classify this LaTeX equation into ONE of these categories:
definition | conservation_law | equation_of_motion | loss_function |
probability | wave_equation | thermodynamic | other

Equation: {eq_latex[:200]}

Respond with ONLY the category name, nothing else."""
    resp = client.messages.create(
        model='claude-haiku-4-5-20251001',   # fast + cheap for classification
        max_tokens=10,
        messages=[{'role': 'user', 'content': prompt}]
    )
    return resp.content[0].text.strip().lower()

# ── Process a paper and build equation taxonomy ────────────────
latex = get_arxiv_latex('2101.03164')   # NequIP paper
if latex:
    eqs = extract_equations(latex)
    print(f"Found {len(eqs)} equations")
    taxonomy = {}
    for eq in eqs[:20]:   # classify first 20
        cat = classify_equation(eq)
        taxonomy.setdefault(cat, []).append(eq[:80] + '...')
    for cat, examples in taxonomy.items():
        print(f"\n{cat.upper()} ({len(examples)} equations)")
        print(f"  Example: {examples[0]}")
```
Section 6 — Fine-Tuning LLMs on Physics Text
General-purpose LLMs (GPT-4, Claude, Llama) are trained on broad internet text and have reasonable but imperfect physics knowledge. For specialised tasks — generating valid LaTeX equations, completing physics derivations, extracting structured data from specific paper formats — fine-tuning a smaller model on domain-specific data can significantly improve performance at much lower cost than using a large API-based model.
The standard approach is LoRA (Low-Rank Adaptation): instead of fine-tuning all model weights, you add small trainable rank-decomposition matrices to the attention layers. This cuts the number of trainable parameters by up to 10,000×, making fine-tuning feasible on a single GPU. Combined with 4-bit quantisation (QLoRA), you can fine-tune a 7B- or 13B-parameter model on a single A100 in hours.
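The parameter saving is simple arithmetic: fully fine-tuning one d_in × d_out weight matrix trains d_in·d_out numbers, while a rank-r LoRA update W + B·A trains only r·(d_in + d_out). A quick sketch with illustrative dimensions (a 4096 × 4096 projection, roughly Llama-scale; exact counts vary by model and layer):

```python
# LoRA parameter counting for one weight matrix W' = W + B @ A
# (illustrative dimensions; real models have many such layers)
d_in, d_out, r = 4096, 4096, 16

full_params = d_in * d_out           # fine-tune W directly
lora_params = r * d_in + d_out * r   # A is [r, d_in], B is [d_out, r]

print(full_params)                                      # 16777216
print(lora_params)                                      # 131072
print(f"{lora_params / full_params:.2%} of the layer")  # 0.78% of the layer
```

The same ratio holds per attention projection, which is why the trainable fraction of the whole model ends up well under one percent.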
```python
# pip install transformers peft datasets bitsandbytes trl
# QLoRA fine-tuning: 4-bit quantisation + LoRA
# Hardware: single A100 (80GB) or 2x A6000 GPUs
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from trl import SFTTrainer
from datasets import Dataset
import torch

# ── Step 1: Prepare physics fine-tuning dataset ────────────────
# Format: instruction-following pairs from physics papers
physics_examples = [
    {
        'instruction': 'Explain the physical meaning of the variational principle in quantum mechanics.',
        'output': 'The variational principle states that for any trial wavefunction |psi>, the expectation value <psi|H|psi>/<psi|psi> provides an upper bound on the true ground state energy E_0. This follows from expanding |psi> in the energy eigenbasis...'
    },
    {
        'instruction': 'Write the CGCNN message-passing update equation in LaTeX.',
        'output': 'The CGCNN update is: \\mathbf{h}_i^{(l+1)} = \\mathbf{h}_i^{(l)} + \\sum_{j \\in \\mathcal{N}(i)} \\sigma\\left(\\mathbf{z}_{ij}^{(l)} \\mathbf{W}_g\\right) \\odot g\\left(\\mathbf{z}_{ij}^{(l)} \\mathbf{W}_f\\right)'
    },
    # ... thousands more examples from papers and textbooks
]

# Format for instruction fine-tuning
def format_example(ex):
    return f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}<|endoftext|>"

dataset = Dataset.from_list([{'text': format_example(e)} for e in physics_examples])

# ── Step 2: Load base model with 4-bit quantisation (QLoRA) ────
bnb_config = BitsAndBytesConfig(
    load_in_4bit              = True,
    bnb_4bit_quant_type       = 'nf4',        # NF4: best for LLM weights
    bnb_4bit_compute_dtype    = torch.bfloat16,
    bnb_4bit_use_double_quant = True,         # double quant for extra compression
)
model_name = 'meta-llama/Llama-3.1-8B'
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config = bnb_config,
    device_map          = 'auto',
    torch_dtype         = torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# ── Step 3: Configure LoRA ─────────────────────────────────────
# Only train rank-16 adapter matrices in the attention projections;
# a small fraction of a percent of the 8B total parameters
lora_config = LoraConfig(
    r              = 16,     # LoRA rank
    lora_alpha     = 32,     # scaling factor
    target_modules = ['q_proj', 'v_proj', 'k_proj', 'o_proj'],
    lora_dropout   = 0.05,
    bias           = 'none',
    task_type      = 'CAUSAL_LM'
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # prints trainable / total parameter counts

# ── Step 4: Train with SFTTrainer ──────────────────────────────
trainer = SFTTrainer(
    model              = model,
    train_dataset      = dataset,
    dataset_text_field = 'text',
    max_seq_length     = 2048,
    args = TrainingArguments(
        output_dir                  = './physics-llm',
        num_train_epochs            = 3,
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 4,   # effective batch = 16
        learning_rate               = 2e-4,
        fp16                        = False,
        bf16                        = True,
        logging_steps               = 25,
        save_strategy               = 'epoch',
        warmup_ratio                = 0.03,
        lr_scheduler_type           = 'cosine',
    )
)
trainer.train()
model.save_pretrained('./physics-llm-lora')
```
Section 7 — LLM-Assisted Physics Coding
One of the highest-ROI applications of LLMs for physicists is code generation. Not replacing the physicist — but handling the boilerplate, suggesting implementations, catching bugs, and translating between mathematical formulations and code. A physicist who uses LLM-assisted coding effectively can implement in hours what used to take days.
The key is knowing how to prompt effectively for physics code. Vague prompts give vague results. Precise prompts that include the mathematical formulation, the expected input/output shapes, the physical constraints, and the test cases give production-quality code on the first attempt.
```python
# Effective prompting patterns for physics code generation
# The more physics context you provide, the better the output

# ── Pattern 1: Equation-to-code with explicit context ──────────
GOOD_PHYSICS_PROMPT = '''
Implement the following in Python using PyTorch:

The Ornstein-Uhlenbeck (OU) process for a particle in a harmonic trap:
    dx = -gamma * x * dt + sigma * sqrt(dt) * N(0,1)
where gamma is the restoring rate and sigma is the noise amplitude.

Requirements:
- Simulate N=1000 particles for T=200 time steps with dt=0.01
- gamma=1.0, sigma=0.5, initial positions drawn from N(0,1)
- Return tensor of shape [T, N] containing all trajectories
- Include the analytical stationary variance: Var_ss = sigma^2 / (2*gamma)
- Verify numerically that the simulated variance matches the analytical value
'''

# ── Pattern 2: Debug with physics context ──────────────────────
DEBUG_PROMPT = '''
This PINN is training but the physics residual loss stays > 0.1
even after 10,000 steps. Expected: < 0.001 for this ODE.
The ODE is: du/dt = -2u, u(0) = 1 (exact solution: u = exp(-2t))

[PASTE CODE HERE]

Diagnose the issue. Check: loss weights, collocation point density,
learning rate, architecture depth, and activation function choice.
Suggest specific fixes with physical justification.
'''

# ── Pattern 3: Validate a physics implementation ───────────────
VALIDATE_PROMPT = '''
Review this Monte Carlo integration of the 2D Ising partition function.
Check for:
1. Correct Boltzmann weight exp(-E/kT) in acceptance criterion
2. Proper periodic boundary conditions
3. Correct normalisation for energy per site
4. Any off-by-one errors in the spin update loop
5. Whether the magnetisation calculation is correct

[PASTE CODE HERE]

For each issue found, explain why it matters physically and provide a fix.
'''

# ── Pattern 4: Translate maths to code ─────────────────────────
TRANSLATE_PROMPT = '''
Convert this LaTeX equation to a numerically stable PyTorch implementation.

Equation (from the paper):
    \\hat{A}_t = \\sum_{k=0}^{T-t} (\\gamma\\lambda)^k \\delta_{t+k}
where delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) (TD error)

Requirements:
- Input: rewards tensor [T], values tensor [T+1], gamma=0.99, lambda_gae=0.95
- Output: advantages tensor [T]
- Must be computed in O(T) not O(T^2) — use the recursive form
- Handle the terminal state correctly (V(s_T) = 0)
- Include docstring explaining each variable's physical meaning
'''

# These prompt patterns consistently outperform vague requests like
# 'implement ising model' or 'fix my PINN code': the physics context
# tells the LLM what constraints matter.
```
Section 8 — Responsible Use: Hallucination, Trust, and Scientific Integrity
The most important section in this cluster. LLMs are powerful tools, but using them irresponsibly in scientific research can damage your credibility, spread misinformation, and — in the worst case — produce published results that are wrong. This section lays out the principles that distinguish responsible from irresponsible use.
LLMs hallucinate plausible-sounding paper titles, authors, journal names, and page numbers. Even RAG systems can generate citations that slightly misrepresent the source. If you include a citation in your paper, you must have read it. This is a non-negotiable principle of academic integrity, and LLMs do not change it.
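One concrete defence, sketched here with the `arxiv` client used earlier (the helper names are my own): look up every LLM-quoted arXiv ID and compare the real title against the claimed one. A fabricated citation fails either at lookup or at the title comparison.

```python
def title_matches(title, expected_words):
    """Pure check: does every claimed keyword appear in the actual title?"""
    t = title.lower()
    return all(w.lower() in t for w in expected_words)

def verify_arxiv_citation(arxiv_id, expected_words):
    """Look up an LLM-quoted arXiv ID and compare the real title against the
    claim. Returns (matches, actual_title). Requires network access."""
    import arxiv  # deferred import: only needed for the live lookup
    results = list(arxiv.Client().results(arxiv.Search(id_list=[arxiv_id])))
    if not results:
        return False, 'no such arXiv ID'
    return title_matches(results[0].title, expected_words), results[0].title

# Offline demonstration of the matching logic:
print(title_matches('Attention Is All You Need', ['attention', 'need']))  # True
print(title_matches('Attention Is All You Need', ['BERT']))               # False
```

This catches outright fabrications automatically; verifying that the citation actually supports the claim still requires reading the paper.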
If an LLM says “this method achieves 0.95 AUC on the ATLAS dataset”, verify the number against the actual paper. LLMs interpolate between training examples and can produce plausible-but-wrong quantitative claims. Any specific number in scientific context must be verified at the source.
Use LLMs to get oriented in a new subfield quickly, understand how concepts relate, generate candidate papers to read, draft initial text for later careful revision, and debug code. These are legitimate, high-value uses that accelerate research without compromising integrity.
Most journals and conferences now have policies on LLM disclosure. If you used an LLM to help draft text, generate code, or process data, state this explicitly in your methods section. The standard is converging toward: disclose use, take full responsibility for accuracy, and do not list LLMs as authors.
LLM-generated physics code can be syntactically correct but physically wrong — wrong sign conventions, missing normalisations, incorrect boundary conditions. Always verify generated code against known analytic results before trusting it for research purposes. The prompting patterns in Section 7 are designed to produce verifiable code precisely for this reason.
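As a concrete instance of this practice, here is a check of the OU simulation from Section 7's Pattern 1 against its analytic stationary variance, written in NumPy for brevity (the particle count, burn-in length, and tolerance are illustrative choices):

```python
import numpy as np

# Verify a simulated OU process against the analytic stationary variance
# Var_ss = sigma^2 / (2 * gamma) — the check Pattern 1 explicitly asks for
rng = np.random.default_rng(42)
gamma, sigma, dt = 1.0, 0.5, 0.01
N, T = 10_000, 5_000            # many particles, long enough to equilibrate

x = rng.normal(size=N)          # initial positions from N(0, 1)
for _ in range(T):              # Euler-Maruyama integration
    x += -gamma * x * dt + sigma * np.sqrt(dt) * rng.normal(size=N)

var_analytic = sigma**2 / (2 * gamma)       # 0.125
var_sim = x.var()
print(f"simulated {var_sim:.4f} vs analytic {var_analytic:.4f}")
assert abs(var_sim - var_analytic) < 0.01   # agree to within Monte Carlo error
```

If the generated code had a wrong sign, a missing sqrt(dt), or a misplaced factor of 2, this assertion would fail immediately — which is exactly the point.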
External References & Further Reading
- Vaswani et al. (2017) — Attention Is All You Need. NeurIPS. arXiv:1706.03762 — The transformer paper. Still the most important ML paper of the 2010s.
- Cohan et al. (2020) — SPECTER: Document-level Representation Learning using Citation-informed Transformers. ACL. arXiv:2004.07180 — The scientific paper embedding model.
- Beltagy et al. (2019) — SciBERT: A Pretrained Language Model for Scientific Text. EMNLP. arXiv:1903.10676
- Azerbayev et al. (2023) — LLEMMA: An Open Language Model for Mathematics. arXiv:2310.10631 — LLM pre-trained on mathematical text and code.
- Lewis et al. (2020) — Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS. arXiv:2005.11401 — The original RAG paper.
- Hu et al. (2022) — LoRA: Low-Rank Adaptation of Large Language Models. ICLR. arXiv:2106.09685
- Semantic Scholar API — api.semanticscholar.org — Free API for 200M+ scientific papers with SPECTER embeddings pre-computed.
- Transformers are learned non-local correlation functions. Attention couples every token to every other token via Q·K inner products. Understanding this helps you write better prompts: specific vocabulary activates the right attention patterns.
- SPECTER2 is your embedding model for physics. Pre-trained on citation graphs, it understands domain vocabulary. Build your semantic search index with it, stored in FAISS for fast nearest-neighbour retrieval.
- RAG beats asking from memory for factual questions. Retrieve 5 relevant paper chunks, include them in context, generate the answer. The model answers from your documents, not from hallucinated training data.
- Equation-specific prompts beat vague ones by a large margin. Include: the mathematical formulation, expected input/output shapes, physical constraints, test cases. These four elements make the difference between useful and mediocre generated code.
- QLoRA makes fine-tuning accessible. 4-bit quantisation + LoRA (rank 16) trains well under 1% of the parameters. A 7B model fine-tuned on physics text can match much larger general-purpose models on narrow domain tasks at a fraction of the inference cost.
- Verify everything. Never cite unread papers. Always check numerical claims at the source. Disclose LLM use in your methods. The speed gain from LLMs is real; the responsibility for accuracy is still yours.
