Engineering the Future: Sai Sreenivas Kodur on Scaling AI Systems That Think, Learn, and Operate at Enterprise Scale

The breakthroughs in AI today aren’t happening in research labs. They happen at 2 AM, when production systems fail, on-call engineers scramble, and decisions need to be made in milliseconds.

Sai Sreenivas Kodur has spent the last decade in those moments. From high-scale search infrastructure to voice analytics platforms and a pioneering AI company for the food and beverage industry, Kodur has worked at the sharp edge of what it means to build AI systems that not only work but endure.

From Systems Research to Scalable Reality

Kodur’s engineering mindset was forged at IIT Madras, where his graduate research blended machine learning with compiler optimization algorithms to improve performance across heterogeneous computing environments.

“The real value wasn’t just the technical depth,” he says. “It was learning how to design systems that solve real constraints across architecture, data, and performance.”

That systems-first framing, treating ML not as magic but as part of a larger machine, became a recurring pattern in his career.

It wasn’t long before he’d be putting those ideas to the test, in production.

Making AI Work in Production

At Myntra and later at Zomato, Kodur led teams that built search and recommendation systems for millions of users. Traffic surged. Catalogs updated in real time. The margin for error was thin.

“At that scale, it’s not just about a better prediction, it’s about infrastructure,” he explains. “Caching, freshness, indexing logic, these aren’t backend concerns. They are the product experience.”

In one case, a latency misalignment between the model and the cache caused expired items to appear in user feeds. A tiny detail, but in e-commerce, tiny details cost millions.

“That’s when it clicked for me. Scaling AI isn’t about scaling models. It’s about designing the systems around them.”
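To make that failure mode concrete, here is a minimal Python sketch of one common guard: re-checking item validity at serve time instead of trusting whatever the ranking cache holds. It is an illustration only, not a description of the systems Kodur's teams built; the names `CachedItem`, `fresh_feed`, and the `expires_at` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedItem:
    item_id: str
    score: float        # relevance score computed when the feed was ranked
    expires_at: float   # Unix timestamp after which the item must not be shown


def fresh_feed(cached: list[CachedItem], now: float, limit: int = 10) -> list[str]:
    """Return the top-scoring item ids that are still valid at serve time."""
    # Drop anything that expired after the ranking was cached.
    live = [item for item in cached if item.expires_at > now]
    live.sort(key=lambda item: item.score, reverse=True)
    return [item.item_id for item in live[:limit]]


cached = [
    CachedItem("sneakers", 0.91, expires_at=1_000.0),
    CachedItem("flash-sale-tee", 0.97, expires_at=500.0),  # offer already over
    CachedItem("jacket", 0.80, expires_at=2_000.0),
]
print(fresh_feed(cached, now=600.0))  # ['sneakers', 'jacket']
```

The highest-scoring item is the one that gets filtered out, which is the point: a better model score does not help if the serving layer cannot tell that the item is no longer valid.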

Serving the Enterprise: Reliability as a Feature

Kodur’s next chapter took him deeper into the enterprise. At Observe.AI, as Director of Engineering, he led platform, analytics, and product engineering just as the company began onboarding major enterprise clients.

Suddenly, the rules changed. Uptime wasn’t a feature; it was a contract. Compliance, observability, and auditability weren’t nice-to-haves; they were table stakes.

“We couldn’t just add features. We had to re-architect the platform to deserve trust,” he says.

The work paid off: his team introduced data observability layers that slashed operational tickets by 60%, redesigned infra to support 10x growth, and supported $15M+ in ARR from new enterprise customers, including Uber, DoorDash, and Swiggy.

“Enterprise AI doesn’t scale by brute force. It scales through clarity. Every layer from the API to the database has to carry the weight.”

Building Spoonshot: A Vertical Intelligence Stack

While at Observe.AI, Kodur also began to see the limitations of general-purpose AI. In sectors like food and beverage, where regulation, science, and sensory data drive decisions, off-the-shelf tools fall short.

So he co-founded Spoonshot, an AI company purpose-built for food innovation.

“We weren’t just analyzing data. We were building a brain for food,” he says.

Spoonshot’s core engine, Foodbrain, ingested over 100TB of alternative data from 30,000+ sources. It mapped ingredients to sensory trends, regulatory data, flavor compounds, and consumer insights, surfacing opportunities that human R&D teams often missed.

“One client spotted an emerging spike in ‘umami’ trends months before it hit retail. That kind of signal isn’t in your sales data; it’s buried in food science and niche blogs.”

The platform, Genesis, became a trusted tool for companies like Coca-Cola, Heinz, and PepsiCo to develop new products faster and with greater confidence.

“Domain-aware AI isn’t just ‘smarter.’ It’s more respectful. It understands the user’s world, not just their data.”

Research That Fixes Real Problems

Kodur’s contributions to AI don’t end at products. He’s also published practical research grounded in day-to-day engineering pain.

His 2025 paper on Debugmate, an AI agent for on-call triaging, tackled a universal developer nightmare: late-night outages and complex system failures.

“Ask any engineer what they dread. It’s not bad code; it’s the moment you’re alone with a vague alert and 10 dashboards. Debugmate was our answer.”

By correlating observability signals, internal system knowledge, and historical tickets, the agent reduced incident load by 77%. That was not a theoretical gain but real operational relief.
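The idea of correlating a fresh alert with historical tickets can be sketched in a few lines of Python. This is a toy stand-in, not Debugmate's method, which the article does not describe: it ranks past tickets by plain word overlap (Jaccard similarity) with the alert text. The `Ticket` class, `similar_tickets` function, and sample data are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    summary: str
    resolution: str


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for real alert/log parsing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similar_tickets(alert: str, history: list[Ticket], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank past tickets by Jaccard overlap between alert and ticket summary."""
    alert_tokens = tokens(alert)
    scored = []
    for ticket in history:
        ticket_tokens = tokens(ticket.summary)
        union = alert_tokens | ticket_tokens
        score = len(alert_tokens & ticket_tokens) / len(union) if union else 0.0
        if score > 0:  # ignore tickets with no shared words at all
            scored.append((ticket.ticket_id, round(score, 2)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


history = [
    Ticket("T-101", "checkout api latency spike after deploy", "rolled back release"),
    Ticket("T-102", "search index stale for new catalog items", "restarted indexer"),
    Ticket("T-103", "disk full on logging host", "rotated logs"),
]
print(similar_tickets("latency spike on checkout api", history))
```

Even this crude version surfaces the earlier checkout incident first, along with its recorded resolution. A production agent would presumably need far richer signals (metrics, traces, service ownership) than word overlap can provide.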

“We weren’t trying to ‘do research.’ We were solving a problem we lived through.”

That ethos, practitioner-first and problem-led, is a hallmark of Kodur’s approach to AI systems.

Building an AI-Native Organization

In a recent three-part blog series, Kodur mapped out his thinking on what comes next: not just using AI to build software, but reorganizing teams and operating procedures around how software itself gets built, with AI in the loop as both builder and operator.

“The old stack was built for human workflows. But today, assistants like Claude and Devin are not just writing code, they’re taking the role of pilots while human engineers are merely co-pilots.”

The challenge? Infrastructure hasn’t caught up.

“AI is now a user of your systems and a maintainer. The abstractions need to change.”

In his view, the AI-native organization needs:

  • Self-observing platforms that diagnose and heal themselves
  • Developer velocity abstractions that work with generated code
  • Governance that assumes iteration is constant, not occasional

“Reliability won’t come from checklists. It will come from how the system is born.”

You can read the whole blog series at aiworldorder.xyz.

What’s Next: Compounding Machines

Looking ahead, Kodur believes that platform engineering will define the next decade of AI, not just as a post facto function, but as the backbone of systems that evolve autonomously.

“We’re not just shipping software anymore. We’re building compounding machines,” he says. “Every model you deploy trains another. Every insight feeds the next. If the platform can’t keep up, the whole thing collapses.”

His vision? A world where infrastructure is self-managing, where AI agents operate systems with accountability, and where every line of code moves us closer to scalable, resilient, domain-aware intelligence.

Final Thought: The Blueprint for AI Engineers

Image by DC Studio on Freepik

If you’re an engineering leader wondering how to architect systems for this new reality, where AI isn’t a feature but a participant, Sai Sreenivas Kodur’s journey is more than a biography.

It’s a playbook.

Build for change, not control. Assume the AI is watching. And design your systems like they’ll be inherited by an agent with no context but full access.

Welcome to the AI-native era. Are your systems ready?

Want more stories like this? Explore AI Journ’s archive for practitioner-driven insights on building reliable, scalable, AI-first platforms.

