I'd love to know what search engine provider they're using under the hood for this. I asked them on Twitter and didn't get a reply (yet): https://twitter.com/simonw/status/1971210260015919488
Crucially, I want to understand the license that applies to the search results. Can I store them, can I re-publish them? Different providers have different rules about this.
apimade 16 minutes ago
It is strange to launch this type of functionality without even a privacy policy in place.
It makes me wonder if they’ve partnered with another of their VC’s peers who’s recently had a cash injection, and they’re being used as a design partner/customer story.
Exa would be my bet. YC backed them early, and they’ve also just closed an $85M Series B. Bing would be too expensive to run freely without a Microsoft partnership.
Get on that privacy notice soon, Ollama. You’re HQ’d in CA, so you’re definitely subject to CCPA.
https://oag.ca.gov/privacy/ccpa
I can imagine the reaction if the zero-retention provider backing them turned out to be Alibaba.
mchiang 23 minutes ago
We work with search providers and ensure that we have zero data retention policies in place.
The search results are yours to own and use; you are free to do what you want with them. Of course, you are bound by the laws of the legal jurisdiction you are in.
kingnothing 18 minutes ago
You can say you're training an AI model and do whatever you want with it.
sorenjan 1 hour ago
I had no idea they had their own cloud offering; I thought the whole point of Ollama was local models. Why would I pay $20/month to use small, inferior models instead of using one of the usual AI companies like OpenAI or even Mistral? I'm not going to make an account to use models on my own computer.
ricardobeat 1 hour ago
For models you can't run locally, like gpt-oss-120b, deepseek, or qwen3-coder 480b. And it's a way for them to monetize the success of Ollama.
mchiang 1 hour ago
Fair question. Some of the supported models are large and wouldn't fit on most local devices. This is just the beginning; with the relationships we've built with the model providers, Ollama doesn't need to exclude cloud-hosted frontier models either. We just have to be mindful that Ollama stands with developers and solves their needs.
https://ollama.com/cloud
> Some of the supported models are large and wouldn't fit on most local devices.
Why would I use those models on your cloud instead of using Google's or Anthropic's models? I'm glad there are open models available and that they get better and better, but if I'm paying money to use a cloud API I might as well use the best commercial models; I think they will remain much better than the open alternatives for quite some time.
mchiang 1 hour ago
When we started Ollama, we were told that open-source models (open-weight wasn't a term back then) would always be inferior to the closed-source models. That was two years ago (Ollama's birthday is July 18th, 2023).
Fast forward to now: open models are quickly catching up, at a significantly lower price point for most, and they can be customized for specific tasks instead of being general purpose. For general-purpose use, the closed models are absolutely dominating right now.
typpilol 22 minutes ago
Yeah, a lot of people don't realize you could spend $2k on a 5090 to run some of the large models.
Or spend $20 a month for models even a 5090 couldn't run, and not have to pay for your own electricity, hardware, maintenance, updates, etc.
oytis 34 seconds ago
$20 a month for a commercial model is price dumping financed by investors. For Ollama, it's hopefully a sustainable price.
ineedasername 1 hour ago
A person can use Google’s Gemma models on Ollama’s cloud and possibly pay less. And have more quality control that way (and other types of control, I guess), since you don’t need to wonder whether a recent model update or load-balancing throttling impacted results. Your use case doesn’t generalize.
disiplus 1 hour ago
Hi, to me this sounds like you are going in the direction of OpenRouter.
dcreater 59 minutes ago
Yeah, it's been a steady pivot to profitable features. Wonderful to see them build a reputation through FOSS and a codebase from free labor, then cash in.
all2 48 minutes ago
What sort of monetization model would you like to see? What model would you deem acceptable?
dcreater 15 minutes ago
Ollama, the local inference platform, stays completely local, maintained by a non-profit org with dev time contributed by the for-profit company. The VC-backed company can build its cloud inference platform and use Ollama as a platform to market and even integrate with, but keep it as a separate product (not named Ollama).
This is almost exactly how DuckDB/MotherDuck functions, and I think they're doing an excellent job.
kergonath 25 minutes ago
As long as the software that runs locally gets maintained (and ideally improved, though if it is not I’ll simply move to something else), I find it difficult to be angry. I am more annoyed by software companies that offer a nerfed "community edition" whose only purpose is to coerce people into buying the commercial version.
dcreater 18 minutes ago
> software companies that offer a nerfed "community edition" whose only purpose is to coerce people into buying the commercial version.
This is the play. It's only a matter of time till they do it. Investors will want their returns.
coffeecoders 54 minutes ago
On a slightly related note:
I've been thinking about building a home-local "mini-Google" that indexes maybe 1,000 websites. In practice, I rarely need more than a handful of sites for my searches, so it seems like overkill to rely on full-scale search engines for my use case.
My rough idea for architecture:
- Crawler: A lightweight scraper that visits each site periodically.
- Indexer: Convert pages into text and create an inverted index for fast keyword search. Could use something like Whoosh.
- Storage: Store raw HTML and text locally, maybe compress older snapshots.
- Search Layer: Simple query parser to score results by relevance, maybe using TF-IDF or embeddings.
I would do periodic updates and build a small web UI to browse.
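A minimal sketch of the indexer and search layer with Whoosh might look like the following (the url/body schema is hypothetical; the crawler and HTML-to-text step are stubbed out):

    # Hedged sketch: tiny Whoosh index + keyword search for a ~1,000-site crawl.
    # Assumes `pip install whoosh`; fetching and HTML-to-text are stubbed.
    import os
    from whoosh.fields import Schema, TEXT, ID
    from whoosh.index import create_in, open_dir
    from whoosh.qparser import QueryParser

    schema = Schema(url=ID(stored=True, unique=True), body=TEXT(stored=True))
    os.makedirs("indexdir", exist_ok=True)
    ix = create_in("indexdir", schema)  # use open_dir("indexdir") on later runs

    def index_page(url: str, text: str) -> None:
        # update_document overwrites an earlier snapshot of the same URL.
        writer = ix.writer()
        writer.update_document(url=url, body=text)
        writer.commit()

    def search(query: str, limit: int = 10):
        with ix.searcher() as searcher:  # BM25F (a TF-IDF variant) by default
            q = QueryParser("body", ix.schema).parse(query)
            return [(hit["url"], hit.score) for hit in searcher.search(q, limit=limit)]

    index_page("https://example.com", "placeholder text extracted from the page")
    print(search("placeholder"))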
Anyone tried it or are there similar projects?
harias 29 minutes ago
YaCy (https://yacy.net) can do all this, I think. Cloudflare might block your IP pretty soon though if you try to crawl.
matsz 46 minutes ago
You could take a look at the leaked Yandex source code from a few years ago. I'd expect their architecture to be decent enough.
Have you ever tried https://marginalia-search.com ? I love it.
I wish they would instead focus on local tool use. I could just use my own web search via the Brave API.
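For reference, wiring up Brave Search as a local tool is only a few lines. A hedged sketch (assumes a BRAVE_API_KEY subscription token; the response fields follow Brave's web-search endpoint docs):

    # Hedged sketch: Brave Search API as a drop-in local web-search tool.
    # Assumes a BRAVE_API_KEY subscription token in the environment.
    import os
    import requests

    def brave_search(query: str, count: int = 5) -> list[dict]:
        resp = requests.get(
            "https://api.search.brave.com/res/v1/web/search",
            params={"q": query, "count": count},
            headers={
                "Accept": "application/json",
                "X-Subscription-Token": os.environ["BRAVE_API_KEY"],
            },
            timeout=10,
        )
        resp.raise_for_status()
        # Each result carries a title, url, and description snippet.
        return resp.json().get("web", {}).get("results", [])

    for r in brave_search("ollama web search"):
        print(r["title"], r["url"])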
parthsareen 55 minutes ago
Hey! Author of the blog post here; I also work on Ollama's tool calling. There has been a big push on tool calling over the last year to improve the parsing. What issues are you running into with local tool use? What models are you using?
vrzucchini 24 minutes ago
Hey, unrelated to the question you're answering but where do I see the rate limits for free and paid tiers?
mrkeen 1 hour ago
Any tips on local/enterprise search?
I like using ollama locally and I also index and query locally.
I would love to know how to hook ollama up to a traditional full-text-search system rather than learning how to 'fine tune' or convert my documents into embeddings or whatnot.
ineedasername 41 minutes ago
You can use Solr: very good full-text search, and it has an MCP integration. That’s sufficient on its own and straightforward to set up:
https://github.com/mjochum64/mcp-solr-search
A slightly heavier lift, but only slightly, would be to also use Solr to store a vectorized version of your docs and simultaneously do vector-similarity search; Solr has built-in kNN support for it. Pretty good combo to get good quality with both semantic and full-text search.
Though I’m not sure whether it would be similar work to do Solr w/ ChromaDB for the vector portion and marry the results via LLM pixie dust (“you are the helpful officiator of a semantic full-text matrimonial ceremony” etc). Also not sure of the relative strengths of ChromaDB vs Solr there; maybe ChromaDB scales better for larger vector stores?
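A rough sketch of that hybrid setup via pysolr (assumes Solr 9 with a DenseVectorField in the schema; the field names and the embed() stub are hypothetical):

    # Hedged sketch: hybrid keyword + kNN vector search against a Solr 9 core.
    # Schema assumption (hypothetical): text field `body`, DenseVectorField `body_vector`.
    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/docs", timeout=10)

    def embed(text: str) -> list[float]:
        raise NotImplementedError("plug in your embedding model here")

    def keyword_search(query: str, rows: int = 10):
        # Plain full-text search over the body field.
        return solr.search(f"body:({query})", rows=rows)

    def vector_search(query: str, top_k: int = 10):
        # Solr's built-in knn query parser over the dense vector field.
        vec = ",".join(str(x) for x in embed(query))
        return solr.search(f"{{!knn f=body_vector topK={top_k}}}[{vec}]")

    # Merge the two result lists (e.g. reciprocal rank fusion), or hand both
    # sets to an LLM for the "matrimonial ceremony" step described above.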
all2 34 minutes ago
Docling might be a good way to go here. Or consider one of the existing full-text search engines like Typesense.
frabonacci 54 minutes ago
This is a nice first step. Web search makes sense, and it’s easy to imagine other tools being added next: filesystem, browser, maybe even full desktop control. Could turn Ollama into more than just a model runner. Curious if they’ll open up a broader tool API for third-party stuff too.
MisterBiggs 1 hour ago
I was hoping for more details about their implementation. I saw Ollama as the open-source, platform-agnostic tool, but I worry their recent posturing is going against that.
jmorgan 1 hour ago
We did consider building functionality into Ollama that would fetch search results and website contents using a headless browser or similar. However, we had a lot of worries about result quality, and also about IP blocking from Ollama creating crawler-like behavior. Having a hosted API felt like a fast path to get results into users' context window, but we are still exploring the local option. Ideally you'd be able to stay fully local if you want to (even when using capabilities like search).
dcreater 57 minutes ago
Their posture has continually been getting worse. It's deceptive, and I've expunged it from all my systems.
nextworddev 26 minutes ago
Can someone tell me how much this costs and how it compares to Tavily, etc.?
typpilol 20 minutes ago
Tavily gives you 1k free requests a month.
Even with heavy AI usage I'm only at like 400/1000 for the month.
dumbmrblah 34 minutes ago
What is the data retention policy for the free account versus the cloud account?
drnick1 49 minutes ago
What "Ollama account"? I am confused; I thought the point of Ollama was to self-host models.
mchiang 18 minutes ago
To use additional features or Ollama's cloud-hosted models, you can sign up for an Ollama account.
For starters, this is completely optional. Everything can stay completely local; an account also lets you publish your own models to ollama.com to share with others.
throwaway12345t 1 hour ago
Do they pull their own index like Brave, or are they using Bing/Google in the background?
tripplyons 1 hour ago
Based on the fact that there are very few up-to-date English-language search indexes (Google, Bing, and Brave if you count it), it must be incredibly costly. I doubt they are maintaining their own.
throwaway12345t 1 hour ago
We need more indexes
JumpCrisscross 56 minutes ago
> We need more indexes
Not particularly. Indexes are sort of like railroads. They're costly to build and maintain. They have significant external costs. (For railroads, in land use. For indexes, in crawler pressure on hosting costs.)
If you build an index, you should be entitled to a return on your investment. But you should also be required to share that investment with others (at a cost to them, of course).
tripplyons 38 minutes ago
More competition in the space would be great for me as a consumer, but the problem is that the high fixed costs make starting an index difficult.
ineedasername 39 minutes ago
Do we know what OpenAI uses? Have they built their own, or do they piggyback on moneybags $MS and Bing?
They use Bing: https://www.forbes.com/sites/katherinehamilton/2023/05/23/ch...
Does this work with (tool use capable) models hosted locally?
parthsareen 54 minutes ago
Hi, author of the post. Yes, it does! The "build a search agent" example can be used with a local model. I'd recommend trying qwen3 or gpt-oss.
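A hedged, minimal version of that agent loop (assumes the current ollama Python client, which exposes web_search as a callable tool per the announcement, plus an OLLAMA_API_KEY in the environment; message shapes may differ across client versions):

    # Hedged sketch of a one-shot search agent with a local model.
    # Assumes `pip install ollama`, a local qwen3 pull, and OLLAMA_API_KEY set
    # for the hosted web_search call.
    from ollama import chat, web_search

    messages = [{"role": "user", "content": "What did Ollama announce about web search?"}]

    resp = chat(model="qwen3", messages=messages, tools=[web_search])
    messages.append(resp.message)

    # If the model asked for the tool, run it and feed the results back.
    for call in resp.message.tool_calls or []:
        if call.function.name == "web_search":
            results = web_search(**call.function.arguments)
            messages.append({"role": "tool", "content": str(results), "tool_name": "web_search"})

    final = chat(model="qwen3", messages=messages)
    print(final.message.content)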
lxgr 44 minutes ago
Very cool, thank you!
Looking forward to trying it with a few shell scripts (via the llm-ollama extension for the amazing Python ‘llm’) or Raycast (the lack of web search support for Ollama has been one of my biggest reasons for preferring cloud-hosted models).
parthsareen 40 minutes ago
Since we shipped web search with gpt-oss in the Ollama app, I've personally been using that a lot more, especially for research-heavy tasks that I can shoot off. Plus, with a 5090 or the new Macs, it's super fast.
yggdrasil_ai 1 hour ago
I don't think Ollama officially supports any proper tool use via the API.
lxgr 57 minutes ago
Huh, I was pretty sure I used it before, but maybe I’m confusing it with some other python-llm backend.
Is https://ollama.com/blog/tool-support not it?
It depends on the model. Deepseek-R1 says it supports tool use, but the system prompt template does not have the tool-include callouts. YMMV.
chungus42 2 hours ago
My biggest gripe with small models has been the inability to keep them informed with new data. Seems like this at least eases the process.
mchiang 2 hours ago
I was pleasantly surprised by the model improvements when testing this feature.
For smaller models, it can augment them with the latest data fetched from the web, solving the problem of smaller models lacking specific knowledge.
For larger models, it can start functioning as deep research.
bigyabai 1 hour ago
> Create an API key from your Ollama account.
Dead on arrival. Thanks for playing, Ollama, but you've already done the legwork of obsoleting yourself.
disiplus 1 hour ago
They have to start earning money at some point.
bigyabai 1 hour ago
At some point you have to earn user trust. If Ollama won't be the open-source Ollama API provider, there are several endpoint-compatible alternatives happy to replace them.
From where I'm standing, there's not enough money in B2C GPU hosting to make this sort of thing worthwhile. Features like paid search APIs really hammer home how difficult it is to provide value around that proposition.
Just set up SearXNG locally if you want a free/local web search MCP: https://gist.github.com/tripplyons/a2f9d8bd553802f9296a7ec3b...
That's what I have, together with Open WebUI and gpt-oss-120b. It works reasonably well, but sometimes the searches are slow.
tripplyons 1 hour ago
You can try removing search engines that fail or reducing their timeout setting to something faster than the default of a few seconds.
disiplus 1 hour ago
SearXNG is fast; it's mostly the code that triggers the searches. Because my daily driver is ChatGPT, I still haven't tried to tweak it.
tripplyons 1 hour ago
I haven't needed to tweak mine for similar reasons, but I'm surprised to hear that the "code that triggers the searches" is slow. Are you referring to something in Open WebUI?
It's the tools that you can install from Open WebUI:
https://openwebui.com/tools
I haven't tried SearXNG personally. How does it compare to Ollama's web search in terms of the search content returned?
tripplyons 1 hour ago
I have no idea how well Ollama's works, but I haven't run into any issues with SearXNG. The alternatives aren't worth paying for in any use case I've encountered.