Replies: 5 comments 3 replies
-
Here are the changes that I made for the web-search agent. To be safe, I purged the old Docker install and made a new one with the updated prompts.
I had to tweak the embeddings extensively to make the new prompt work as intended: some combinations did not produce good results at all, giving irrelevant results, no web results whatsoever, or forcing the AI to make up the answer alone, hallucinating results that were not there.
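The sensitivity to embedding choice described above comes down to how retrieved documents are re-ranked against the query. A minimal sketch of cosine-similarity re-ranking, with a toy bag-of-words "embedding" standing in for a real model (in practice Perplexica would call something like the Llama 3 or BGE embeddings mentioned in this thread):

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy bag-of-words vector; a real setup would call an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank(query, docs):
    # Sort candidate documents by similarity to the query embedding.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = [
    "how to bake sourdough bread",
    "quantum entanglement in quantum computing",
    "local weather forecast for today",
]
print(rerank("quantum computing applications", docs)[0])
# quantum entanglement in quantum computing
```

Swapping `embed` for a different model changes these similarity scores, which is why a different embedding backend can flip which results are considered relevant.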
-
Updated prompt, possibly slightly better.
-
Using the improved prompt, Perplexica ranked better than Perplexity, Edge Copilot, and Google Gemini in web search.

Summary of performance:
- Interaction Quality / Relevance: Perplexica (Web AI 1) performed best, accurately interpreting and responding to queries with a score of 8. The other AIs scored between 6 and 7.
- Content Relevance / Depth: Perplexica provided the most comprehensive content, scoring 8. The other AIs were consistent but slightly less detailed.
- Related Searches / Relevance: Suggested related searches were mostly relevant, with Perplexica and Perplexity Copilot performing best.

Conclusion: Perplexica (Web AI 1) demonstrated the best overall performance across all criteria, particularly excelling in interaction quality and content relevance. Perplexity Copilot (Web AI 2) and Perplexity Vanilla (Web AI 3) also performed well but had slight areas for improvement. Microsoft Copilot (Web AI 4) and Google Gemini (Web AI 5) were satisfactory but lagged behind in user experience and content depth. Overall, Perplexica's improved prompt and use of the Phi-3 medium model contributed significantly to its superior performance, making it the best choice for comprehensive, accurate, and user-friendly AI search interactions.
-
I need to be able to add and search local data sources, like a vector DB. I also need to restrict searches to certain URLs. How can this be done?
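On the URL-restriction part of the question: since Perplexica queries through SearXNG, which forwards queries to engines that honor the `site:` operator, one common approach is to rewrite the query with `site:` filters before it reaches the engine. A hypothetical helper (the function name and approach are illustrative, not a Perplexica feature):

```python
def restrict_to_sites(query, domains):
    """Append site: operators so the search engine only returns
    results from the given domains (most engines OR them together)."""
    sites = " OR ".join(f"site:{d}" for d in domains)
    return f"{query} ({sites})"

print(restrict_to_sites("quantum entanglement", ["arxiv.org", "nature.com"]))
# quantum entanglement (site:arxiv.org OR site:nature.com)
```

Indexing local data into a vector DB and searching it alongside web results would be a larger change to the retrieval pipeline; the query-rewriting trick above only covers the URL restriction.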
-
@Zirgite Thank you for your excellent work on improving and benchmarking the Perplexica copilot. I am just a bit lost about the differences from your other implementation in #258; can you please clarify what the intent is here vs. there?
-
I ran some tests to compare Perplexity and Perplexica answers. For Perplexica I used the Phi-3 model with a new prompt that applies what Meta's terminology calls system 2 thinking. I am still running tests.
The results are anonymized, as the evaluating model (GPT-4o or Claude 3.5) should not know which system is which; they are labeled AI system 1, 2, etc. My test prompt is:
"Evaluate the performance of the following Web AI Search systems focusing on the interaction between the AI model and the search results. Prioritize the overall quality of interaction, followed by content relevance, and related searches. Use the following criteria:
1. Interaction Quality:
◦ Relevance: How accurately does the AI model interpret and respond to the query?
◦ Clarity: How clear and understandable are the AI-generated responses?
◦ Helpfulness: How useful are the AI responses in guiding the user to relevant information?
◦ User Experience: How intuitive and seamless is the interaction with the AI model?
2. Content Relevance:
◦ Depth: Does the content provided cover the query comprehensively?
◦ Accuracy: Is the content factually correct and well-researched?
◦ Authority: Are the sources of the content reputable and reliable?
3. Related Searches:
◦ Relevance: How relevant are the suggested related searches to the original query?
◦ Diversity: Do the related searches cover a wide range of subtopics related to the original query?
◦ Usefulness: Are the related searches useful in refining or expanding the search?
Use the following ranking system from 1 to 10 for each criterion, where 1 represents poor performance and 10 represents excellent performance.
Present the results in a tabular format."
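Since the prompt prioritizes interaction quality over content relevance over related searches, it can help to collapse the per-criterion scores into one weighted overall number when comparing systems. A sketch with assumed weights (the prompt only states an ordering, not weights, so these are illustrative):

```python
# Assumed priority weights; the evaluation prompt only states an ordering.
WEIGHTS = {"interaction_quality": 0.5, "content_relevance": 0.3, "related_searches": 0.2}

def overall(scores):
    """scores maps each category to a list of 1-10 criterion scores;
    returns the weighted mean of the per-category averages."""
    return sum(WEIGHTS[cat] * (sum(vals) / len(vals)) for cat, vals in scores.items())

example_system = {
    "interaction_quality": [8, 8, 8, 8],  # relevance, clarity, helpfulness, UX
    "content_relevance": [8, 8, 8],       # depth, accuracy, authority
    "related_searches": [8, 8, 8],        # relevance, diversity, usefulness
}
print(round(overall(example_system), 2))  # 8.0
```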
Several useful findings.
USEFUL FIND 1: Perplexity Copilot may be more useful when you want to interact with close-to-real-time information, e.g. trip planning, the weather, etc.
Perplexica has issues here because SearXNG has to be explicitly asked to return current news from the last day. Even when the AI model builds the correct prompt, the search engine does not use real-time information and falls back to a general search.
This can be improved, since vanilla Perplexica is able to reach current news feeds, such as the local weather right now.
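One concrete way to address the freshness issue: SearXNG's search endpoint accepts a `time_range` parameter (`day`, `week`, `month`, `year`), so the query the model builds could be sent with that set whenever the question asks for current news. A sketch, assuming a local SearXNG instance on port 8080 (the URL and helper name are assumptions for illustration):

```python
from urllib.parse import urlencode

SEARXNG = "http://localhost:8080/search"  # assumed local SearXNG instance

def build_search_url(query, fresh=False):
    """Build a SearXNG JSON search URL, pinning results to the
    last day when the query needs up-to-date information."""
    params = {"q": query, "format": "json"}
    if fresh:
        params["time_range"] = "day"
    return f"{SEARXNG}?{urlencode(params)}"

print(build_search_url("local weather now", fresh=True))
```

The remaining work would be deciding when to set `fresh=True`, e.g. by having the model flag time-sensitive queries in its rewritten search prompt.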
Regarding tough topics, an example of a difficult search:
Query: Explain the concept of quantum entanglement and its potential applications in quantum computing.
Perplexity vanilla does not fall behind Copilot; it is very close in other tests too, never more than 10% lower.
On that search, Perplexica, Perplexity vanilla, and Perplexity Copilot were on par:
| Criterion | AI 1 | AI 2 | AI 3 | AI 4 |
| --- | --- | --- | --- | --- |
| Relevance | 9 | 9 | 8 | 9 |
| Clarity | 8 | 8 | 7 | 8 |
| Helpfulness | 9 | 9 | 8 | 9 |
| User Experience | 8 | 8 | 7 | 8 |
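Averaging the four criteria above per column gives a quick overall interaction score for each anonymized system:

```python
# Per-criterion scores from the table above, one list per system column
# (relevance, clarity, helpfulness, user experience).
scores = {
    "AI 1": [9, 8, 9, 8],
    "AI 2": [9, 8, 9, 8],
    "AI 3": [8, 7, 8, 7],
    "AI 4": [9, 8, 9, 8],
}
averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}
print(averages)  # AI 3 averages 7.5; the other three average 8.5
```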
USEFUL FIND 2: On hard topics for which well-established search results do exist, Perplexity vanilla and Perplexica do not fall behind Perplexity Copilot. Smaller models work exceptionally well.
USEFUL FIND 3: When I asked for more depth and user-friendliness, Perplexica delivered more.
USEFUL FIND 4: In Perplexica, changing the embedding model can change the results significantly, e.g. switching from local to Ollama embeddings, or between the different options (BGE, etc.). Here I get the best results with Ollama (not local) and Llama 3 embeddings.
USEFUL FIND 5: The integration with the search engine can be improved to be more useful when searching for the latest data, such as news.
USEFUL FIND 6: Perplexity, Windows Copilot, and Gemini are SPECIFICALLY instructed to be user-friendly and return easy-to-understand results. So when the topic is deep, you need to tell them SPECIFICALLY that you are not the average user but a specialist, as they do not know that. I think this explains their slightly lower scores.