Videos

Introduction To Perfect Search Technology

What is Perfect Search? We are an indexing system that takes a unique approach to creating the index and the index keys. This allows us to run searches very rapidly that typically take a long time. Perfect Search does the full-text search, ranked search, and fielded search that databases typically do. What makes Perfect Search unique is the approach we take to building and querying indexes, as well as the ability to merge incremental indexes in real time and roll them out without interfering with customers’ querying of the system.

How much faster is Perfect Search? We are typically 10 times faster, but, when cached, we can be 100 or 1,000 times faster.

Search Appliance Solutions

What is unique about the Perfect Search Appliance is that it can scale higher and support more data than anything else on the market. Perfect Search has a proprietary engine, a new innovation that’s radically different from any other search engine on the market. We have the world’s fastest search engine, because Perfect Search takes a new approach to search and has implemented this approach in the most efficient way possible. Instead of having to go inside Exchange, Lotus Notes or a DB2 database to do a query, we can run a search and have it go across the entire corpus of data inside an enterprise.

Enterprise Search Expert, Stephen Arnold

There are two ways to do federation. The most common way is to send a query to different sources and, when the results come back, dedupe and combine them. The more sophisticated way to perform federation is at the time of content processing, whereby you generate an index which already includes the federating metadata. The difference is that one is relatively easy, while the other is actually quite difficult yet yields a number of benefits.
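The first, query-time style of federation can be sketched in a few lines. This is only an illustration under stated assumptions: the two source functions, the document ids, and the scores are hypothetical, and nothing here reflects how Perfect Search or any particular product implements federation.

```python
from typing import Callable, Dict, List, Tuple

# A hit is (document_id, score). Each source is a hypothetical search callable.
Hit = Tuple[str, float]
Source = Callable[[str], List[Hit]]


def federate(query: str, sources: List[Source]) -> List[Hit]:
    """Send the query to every source, dedupe by document id, and combine."""
    best: Dict[str, float] = {}
    for search in sources:
        for doc_id, score in search(query):
            # When two sources return the same document, keep the higher score.
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Highest score first; ties are broken by id so the order is deterministic.
    return sorted(best.items(), key=lambda hit: (-hit[1], hit[0]))


# Hypothetical sources standing in for, say, an email archive and a database.
def mail_source(query: str) -> List[Hit]:
    return [("doc-1", 0.9), ("doc-2", 0.4)]


def db_source(query: str) -> List[Hit]:
    return [("doc-2", 0.7), ("doc-3", 0.5)]


if __name__ == "__main__":
    print(federate("quarterly report", [mail_source, db_source]))
```

The second style described above moves this work to content-processing time, so that a single index already holds the combined metadata and no per-query merge is needed.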

What is the difficulty of hyper-federation? The challenges that I see in hyper-federation are dealing with the processing bottlenecks of input/output and access to memory, and simply coping with large flows of data in a very short span of time. All present severe technical challenges. Yet in our tests with Perfect Search, it’s exciting to see that Perfect Search has cracked this problem.

What are the benefits achieved with hyper-federation?
The primary benefit comes when a user wants to run a single query and see results that are pertinent to that query, without delay. So anything that speeds the display of the results after the query is sent to the index is a gain for the user. The payoff is at the user end.

What are your thoughts about Perfect Search?
Two things jumped out at us when we looked at Perfect Search. First, traditional databases are expensive and stuffed with problems. Historically, quick access to large amounts of data creates costs, although this is not the case with Perfect Search. Second, when there is a problem, you want to be able to call a company and get an informed engineer to give you a precise answer. Perfect Search scored very, very well on both those counts: Perfect Search delivers great performance and outstanding technical support.

How is Perfect Search’s OneBox Extender used in conjunction with the Google Search Appliance? Google is a company that changes very frequently. One thing that is very clear to us as we look at the Google Search Appliance is that for certain functions it is excellent. But for other functions, such as accessing and merging structured data, it is not as effective. So our theme at Arnold IT is to surf on Google and augment it with the code developed by Perfect Search to connect to the Google Search Appliance. The result is the ability to extend the Google Search Appliance’s capability without having any negative effect on the benefits that Google delivers with its product.

Finding Your Data Super Fast and Cost Efficiently

Companies need to find information within their databases quickly and with cost-efficient solutions. However, as databases grow exponentially in size, and as SQL queries become more complex, current search solutions struggle to meet these needs. The Perfect Search Appliance (PSA) or OneBox Extender (OBX) brings full search functionality to databases, including structured and unstructured (full-text) data. This is done with blazing speed, low hassle, and significant capacity to scale, all delivering an accelerating return on investment. The OBX provides the ability to affordably search massive databases. It offloads the computational search load from the database to the OBX. It indexes and searches all of the data within the database. It federates results across multiple databases into one master index. It provides extremely fast query response times. It allows for continuous incremental indexing, avoiding the problem of fragmented indexes. It searches both unstructured (full-text) and structured data within databases. And it searches CLOB text fields without any system degradation. To see how Perfect Search can be of benefit to your organization, visit www.perfectsearchcorp.com.

World's Fastest Search Technology - Part 1

What is currently considered fast where enterprise search is concerned is not fast enough. Perfect Search delivers a game-changing core search algorithm. Everything built around this core also enables blistering speeds. Perfect Search features a lean operating system. Everything runs as quickly as possible in the C and C++ code layer. We look at all the various pieces of our solutions and ensure we provide the best search result throughout the entire stack.

World's Fastest Search Technology - Part 2

Perfect Search Co-founder and Chief Scientist Ron Millett has spent 25 years in the field of search technology, beginning with traditional systems for Novell and WordPerfect. Perfect Search is ground-breaking, delivering radical changes to the entire design of search technology. As Perfect Search fully developed the various components of this new software technology, the initial predictions were found to be too conservative: the reality showed speeds and performance far beyond what was originally anticipated. More cycles were available to produce better search. Search engines today utilize almost all of their capacity in the basic heart of the system, with no reserve capacity for the advanced features that engineers dream up. Perfect Search’s new core has an entirely different search process architecture, which is why our search has proven to be the world’s fastest enterprise search technology.

World's Fastest Search Technology - Part 3

Perfect Search avoids doing certain things, and that is what enables our unmatched speed, precision and performance. We eliminate irrelevant information faster than any other indexing system. We focus on the most relevant information with the highest potential for accurate hits. Perfect Search looks first for the most relevant results, delivering faster results to the customer.

World's Fastest Search Technology - Part 4

It is often hard for people to understand how Perfect Search can be so much faster than other search technologies. To explain the difference, Perfect Search CEO Tim Stay likes to use the analogy of a prop engine on an airplane and a jet engine. Both will fly a plane’s fuselage. Take a Cessna 182; it can travel a couple hundred miles an hour, carrying 4 or 5 people, traveling as high as 18,000 feet. Compare this against a Dassault jet with a turbofan engine. It can fly at 55,000 feet or higher; it can achieve close to the speed of sound (more than 600 miles an hour) and can carry many more people and more weight. The different engine gives you the ability to do things faster, higher and with far greater capacity.
This is analogous to the Perfect Search engine. Because Perfect Search’s engine is different, it’s not doing things the way other search engines do. It can search more data. It can search across more types of data. It can do more with a search. It can perform many different types of search, including personalized search and searching that approaches artificial intelligence. All of these are benefits of the speed of this revolutionary new engine. Using the older type of search technology, it was impossible to achieve this type of performance. But with Perfect Search’s new technology, a new invention that does search differently than it has ever been done before, we are able to achieve unrivaled speed and performance. We will prove it. If you provide a sample data set, we’ll come into your labs, and we are confident that we can demonstrate the performance of the Perfect Search engine, proving that we are the industry’s fastest enterprise search technology.

What is the Perfect Search One Box Extender?

The Perfect Search OneBox Extender (OBX) allows our search results to show up federated with the Google Search Appliance (GSA) search results. If you have a GSA and want to incorporate Perfect Search’s technology, the Perfect Search OBX allows you to send a query out to our search technology and get combined results back through a single UI. This is useful, for example, if you would like to index a database with a large number of rows that would be cost-prohibitive and outside the scope of your GSA license. You could utilize Perfect Search’s technology, which costs considerably less per licensed item than the GSA. If you are also searching intranet web pages with your GSA, the Perfect Search OBX can federate those results together with the database results.

The Value of the Perfect Search Appliance

Perfect Search Senior Architect Daniel Hardman explains the value of the Perfect Search Appliance. In the enterprise world there is data of all varieties that is growing at a prodigious rate. It is very difficult for people who need to organize and search through that data to use it for productive means, or even to keep track of it. The obvious need for search is to allow people to answer questions: to rein in the chaos, so to speak, and be able to address meaningful questions. And those questions have very direct, rubber-meets-the-road implications for every business.

For example, a business that works with a large customer base, with a heavy influx of incoming calls, may need to mine the data generated during those phone calls to determine the top issues. There are traditional search features in many pieces of enterprise software that allow you to drill down to certain aspects of your data. But those search features are siloed: a CRM tool, for example, has features that enable you to search through customer interaction records. What if you also need to search through a knowledge base and correlate those two things? If you have two separate tools to do this, it becomes problematic. As the number of data sources in the enterprise multiplies, it becomes harder and harder to deal with that information in an intelligent way.

You may have internal processes and resources whereby you need to access databases that need to be scanned and email repositories that need to be searched. All of these sources need to be corralled, and you need to be able to get useful pieces of information and insight from them. An appliance is an easy way to bundle up search functionality without bringing a lot of new complexity to your environment. It is possible to buy pieces of software and deploy them throughout an environment in such a way that you can gather data and make it searchable, but there is a high maintenance cost to that and, in many cases, the complexity of deployment is intractable. On the other hand, with an appliance you add a machine to the mix, plug it in, cable it up, point it at the data sources that you are interested in, and then just sit back and enjoy your search features. The result is that you can be productive very quickly while enjoying low maintenance costs.

There are other search appliance providers in the marketplace, such as Google with its Google Mini and Google Search Appliance (GSA). What is unique about the Perfect Search Appliance is that it can scale higher and support more data than anything else on the market. Where a GSA begins to meet its scaling limits at tens of millions of documents, a Perfect Search Appliance is just barely getting started: the capacity is immense, with the ability to search 1 billion records on a single server. And where a GSA starts to degrade in performance when you max out the amount of data it can hold, the Perfect Search Appliance is cheerfully returning thousands of queries per second even when you have billions of records. Because of scaling, the cost of scaling, and the practicality of managing the complexity behind that scaling question, a Perfect Search Appliance is an excellent and unique solution in the market.

The Perfect Search Appliance

Perfect Search CEO Tim Stay gives an overview of the Perfect Search Appliance (PSA). The Perfect Search Appliance allows an enterprise to search across large amounts of its data. It can search across different archives of data, including structured data inside a database, unstructured data in HTML format, email archives, and content management systems like SharePoint, Documentum or any similar product. Perfect Search can search across all of these with a single query. Instead of having to run a query inside your Exchange, inside your Lotus Notes and inside your DB2 database, Perfect Search can do a search that goes across the entire corpus of data that exists inside your enterprise.

Enterprise Perfect Search Appliance

Enterprise search capability is divided into various components: ingestion, indexing, cache and query. In an enterprise system, these components may be found on large servers or scattered throughout the enterprise. In a search appliance, as is the case with the Perfect Search Appliance, all of these components are packaged together in a single server, pre-packaged and pre-configured as a whole system in plug-and-play hardware with only minor configuration required. The Perfect Search Appliance is simple to set up and simple to use.

Perfect Search's Technology and The GSA

The Perfect Search engine, or search technology, is considerably more adaptable and more flexible than the Google Search Appliance (GSA). The GSA does not allow nearly as much control over what gets fed in and when it gets fed in, and it provides far less control overall than Perfect Search’s search technology. Perfect Search’s technology is a lot more open. Our performance is far superior to the GSA’s. We can handle far more records on a single box than the GSA, and Perfect Search can index and retrieve data much more quickly than the GSA.

Real Time Indexing and Fragmented Index

What are a real-time index and a fragmented index? Real-time indexing means that we can absorb new data as people are using the system. The trick is how quickly you’re able to turn over the new information, making it available for querying. We have a timer that fires it off, but the customer can do it as frequently as desired. When we roll the new indexes out, we typically roll out a small incremental index covering a small period of time; it goes very quickly and can be used very quickly in the system. Then we have a gradual merging scheme where these small indexes are merged in the background and rolled out again as a merged file. That also happens very quickly without interrupting any of the querying activity on the system. The benefit is that the customer gets access very quickly to the new data that has been entered into the system. Our typical rollout times are usually about 30 seconds, but, in certain cases, we can speed this up.

Fragmented indexes are when you have a lot of small pieces. We naturally accumulate these during real-time indexing, and each one of the files has to be searched separately in a linear fashion. If you have too many of them, it slows down the search. Perfect Search can search them very quickly, but we know we don’t want to gather too many, so we have an ongoing merge process in the background that merges each group of small files and rolls out a medium-size file, all done without interrupting any of the query activity.
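The roll-out-then-merge cycle described above can be illustrated with a toy model. This is a minimal sketch, assuming each small index is just an in-memory mapping from term to document ids; the function names are hypothetical and the real on-disk format and merge policy are not described in the source.

```python
from collections import defaultdict
from typing import Dict, List, Set

# A segment is one small index: term -> ids of the documents containing it.
Segment = Dict[str, Set[int]]


def build_segment(docs: Dict[int, str]) -> Segment:
    """Index a small batch of new documents into its own segment."""
    segment: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            segment[term].add(doc_id)
    return dict(segment)


def search(segments: List[Segment], term: str) -> Set[int]:
    """Search every live segment in turn (the linear pass over fragments)."""
    hits: Set[int] = set()
    for segment in segments:
        hits |= segment.get(term.lower(), set())
    return hits


def merge(segments: List[Segment]) -> Segment:
    """Combine many small segments into one larger segment."""
    merged: Dict[str, Set[int]] = defaultdict(set)
    for segment in segments:
        for term, ids in segment.items():
            merged[term] |= ids
    return dict(merged)


if __name__ == "__main__":
    live = [build_segment({1: "fast search"}), build_segment({2: "fast index"})]
    print(search(live, "fast"))   # answered from two fragments
    live = [merge(live)]          # swap in the merged segment
    print(search(live, "fast"))   # same answer, one segment to probe
```

Queries keep working against the list of small segments until the merged segment is swapped in, which is the property the paragraph above emphasizes: the merge changes how many files are probed, not the answers.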

How is Perfect Search Different Than a B-Tree Index - Part 1

At a high level, a B-tree index is a tree-based index that many databases use to key into the values in which they are most interested. It is an order-based index, and one of its benefits is that it is relatively easy to update. In a transactional database, a B-tree index allows rapid update of the index as changes are being made to the database. Perfect Search uses a hash-based index. The benefit of a hash index is that it allows us not only to scale better but also to provide better performance. We can look things up much more quickly using our hash-based technology than would be possible in a B-tree-based system.
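The trade-off between the two index families can be seen in a small sketch. It makes two loud assumptions: a Python `dict` stands in for a hash-based index and a sorted list searched with `bisect` stands in for an ordered, B-tree-like index. Neither is Perfect Search's actual structure, and the key names are made up.

```python
import bisect
from typing import List, Optional

# 100,000 hypothetical keys; zero padding keeps them in sorted order.
keys = [f"key{i:06d}" for i in range(100_000)]

# Hash-based index: the key is hashed straight to its slot.
hash_index = {key: position for position, key in enumerate(keys)}

# Ordered index: a sorted list, used here as a stand-in for a B-tree.
sorted_index = sorted(keys)


def hash_lookup(key: str) -> Optional[int]:
    """Expected constant-time lookup, independent of the number of keys."""
    return hash_index.get(key)


def ordered_lookup(key: str) -> Optional[int]:
    """Binary search: about log2(n) comparisons, roughly 17 for 100,000 keys."""
    position = bisect.bisect_left(sorted_index, key)
    if position < len(sorted_index) and sorted_index[position] == key:
        return position
    return None


def ordered_range(low: str, high: str) -> List[str]:
    """What ordering buys you: contiguous range scans between two keys."""
    start = bisect.bisect_left(sorted_index, low)
    end = bisect.bisect_right(sorted_index, high)
    return sorted_index[start:end]


if __name__ == "__main__":
    print(hash_lookup("key000042"), ordered_lookup("key000042"))
    print(ordered_range("key000010", "key000012"))
```

Both lookups return the same position; the difference is in how the work grows with the number of keys, and in the ordered index's natural support for range scans.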

How is Perfect Search Different Than a B-Tree Index - Part 2

The B-Tree index is a very traditional index that most database systems use. The advantage is being able to update it exactly; B-Trees are designed for rapid update, although they are not as good at rapid search. At Perfect Search, we use hashing schemes, which are constant time (order one). B-Trees are typically order log n, which means that with billions of keys you have to do many comparisons in order to get to the exact search key. Perfect Search, on the other hand, delivers unmatched speed. In the way we do our incremental updates, we add incremental files that have the same type of structure. While the indexes are being updated, we search these small files linearly, and Perfect Search merges in the background, combining all the small indexes into a larger index. Perfect Search has a rapid cutover to the newly rolled-out index, and it’s all background activity, getting us back onto a single hashed file. Perfect Search is extremely fast; even our linear searches across these small indexes are extremely fast. We merge, and then search only one index. This covers the fragmenting problem and the incremental update, and speaks to the difference between Perfect Search and B-Tree.

What Makes Perfect Search a "Green" Technology?

Because of all the data Perfect Search can fit on a single server, it is a “green” technology. Compared to competitors, Perfect Search can hold massive amounts of data on a single server. The result is that data centers require far fewer servers. With Perfect Search, a 10 to 1 reduction can be realized in search-related hardware. Much less hardware to power and cool equals a green technology.

What is Incremental Indexing? - Part 1

What is Incremental Indexing? It is what happens when new data arrives and a quick rollout is needed. Rolling out a small file of indexed information each time is what causes the fragmenting. It’s all related: you have many small files and you want to get them out quickly, so they must be merged, all without interfering with any of the querying activity that’s occurring. So, we are willing to run a little bit slower, although slower for Perfect Search equates to only a few more milliseconds. We may have to search hundreds of very small files and get them merged up into a medium-size file. Then, typically at night, you’ll take your medium-size files and merge them up into the big file. This means that periodic merging is taking place. At Perfect Search, we typically use a custom script to merge in the appropriate manner according to the customer’s needs.

What is Incremental Indexing? - Part 2

When new data arrives, it needs to quickly be rolled out as small files of indexed information. These small files need to be merged into the main index with no interference with the on-going querying activity. Perfect Search has a unique ability to do this seamlessly.

The Difference Between Structured and Unstructured Data

Structured data is what you find in a database: many fields of data, with a very well-defined type of data in each field. Unstructured data would include things like newspaper articles or documents that do not contain fields. Perfect Search can search structured and unstructured data equally well: we can feed in unstructured data, index it and make it available for search with incredible speed.

The Importance of Being a Disk Based Search Engine

The importance of being a disk-based search system is size. You’re able to store much more data on a disk than can fit in memory. A search engine like Perfect Search’s can search a disk as fast as competing solutions can search in a memory-based system. We’re then able to put far more data on a single server. The end result is that we have one search server that can handle all of the searches rather than having them spread across multiple servers that are all memory-based search engines. Not only can Perfect Search do these searches faster from the disk, but we can also cache the searches in memory, so we can be extremely fast in memory-based search as well.

Why is Fast Search so Important?

Fast search is critical in order to do more with search—getting more detail to customers, for instance, requires multiple searches. Perfect Search can do 100 searches in the same time it takes our competitors to do one search, so we’re able to add in many more features. These features are not being utilized in search today, because of the time factor—because most search technologies do not have the capacity and performance to do so. Perfect Search has the speed that makes these multiple searches and feature incorporation possible.

Database CLOBs and The Struggle with Indexing Them

A database CLOB is a large database object that is used to store a lot of characters: CLOB stands for Character Large Object. The reason databases have traditionally struggled with indexing large database objects such as a CLOB is that they do complete compares of the object in order to fit it into the B-Tree index. When you have a large object, doing this isn’t feasible, and most databases don’t even allow you to do it. The result is that your indexing is not complete. When trying to query these items, because the database is not able to create an index for them, a linear scan is necessary. This can be very expensive. You may have a million items, and you are linearly scanning through them to try to find a keyword. This is why databases have traditionally struggled: not only because of the indexing but also because of the querying. Many databases do not allow you to do a wildcard search on these kinds of objects, again, because it is too expensive.

Perfect Search solves this problem by doing a complete index on the different parts of the CLOB. Instead of indexing the whole object, Perfect Search breaks it into individual words and phrases and then indexes those. When you search for a keyword, instead of having to do a linear scan, the Perfect Search indexing system allows you to key in on your exact search with sub-second retrieval.
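The contrast between scanning every CLOB and looking a word up in a word-level index can be sketched as follows. This is an illustration only: the row ids and text are invented, the tokenizer is a simple assumption, and the index here is a plain in-memory inverted index rather than Perfect Search's own format.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Break a large text value into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def linear_scan(clobs: Dict[int, str], word: str) -> List[int]:
    """No index: every CLOB is read and tokenized for every query."""
    return sorted(row for row, text in clobs.items()
                  if word.lower() in tokenize(text))


def build_word_index(clobs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Index the individual words of each CLOB instead of the whole object."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for row, text in clobs.items():
        for word in tokenize(text):
            index[word].add(row)
    return index


def indexed_lookup(index: Dict[str, Set[int]], word: str) -> List[int]:
    """One keyed lookup, regardless of how many CLOBs were indexed."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    # Hypothetical rows from a table with a CLOB column.
    clobs = {
        1: "Customer reported a billing error.",
        2: "Shipping delay, no error.",
        3: "Password reset request",
    }
    index = build_word_index(clobs)
    print(linear_scan(clobs, "error"), indexed_lookup(index, "error"))
```

Both functions return the same rows; the scan touches every object on every query, while the index pays that cost once at indexing time.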

What is a Feeder and a Connector?

Our feeder is a technology that we have developed here at Perfect Search that enables us to bridge between a data source and a data repository. In a common case for us, we’ll bridge a data source like a database and feed it into our search engine. The feeder not only allows us to connect to different data sources, but it also allows us to filter and clean the data that we are receiving, so that it is more normalized or more accurate. You can solve a lot of problems even before the data gets indexed by using Perfect Search’s feeder technology. In contrast, the connector is just one component of the feeder: it is the logic and the code that allows us to connect to a data source. So, we could have an Oracle connector, which allows us to take our feeder technology and connect to an Oracle database, feeding data out of the Oracle database. Perfect Search has connectors for all the major databases, for content management systems like SharePoint and Documentum, and for email services such as Exchange and other email applications.

Perfect Search and IBM DB2 Full Text Search

How does Perfect Search compare with IBM DB2 full text search? The IBM DB2 full text search engine is feature-packed and very rich in the components that it supports. Perfect Search has a query technology that radically improves performance and scalability, especially for unstructured data. Perfect Search is not packed with all the rich supporting components that DB2 delivers; thus Perfect Search is much leaner and outperforms DB2 by four to 10 times in most applications.

Windows Mobile Device Search

Windows Mobile is a platform put out by Microsoft that enables cellular phones and other portable devices to have their own host operating system and data located right on the device. Built into that system is a scan search product that allows search but may only search the metadata and files located in a few directories from the top of a directory tree. This scan search process takes a long time and is very intensive in its use of the processor and battery on the mobile device. Perfect Search has produced a version of our enterprise search platform specifically to accommodate handheld mobile devices. We’ve scaled down our enterprise search technology so that it can be installed directly onto mobile devices, enabling the search not only of the metadata but also of the full content of emails and files that are stored on the mobile device. The search results, using Perfect Search’s proprietary technology, are returned in less than a second. This means that when you are using your mobile phone, you get the results right away; you don’t have to drain your battery or wait a minute or more to get back the results you were looking for.

Query Types, Stemming, Wildcard, Range Searching and more...

This highlights the different types of queries that can be used with the Perfect Search engine. Stemming is a common and useful type of query: the stem form of a word is used instead of its inflected forms. This enables you to find different forms of the word that may appear in the documents or the database through which you’re searching, providing more relevant query results. Another type of query is a wildcard search, where you’re looking for part of a word. Range search is yet another important type of search, used, for example, when searching for dates. Range searches are done very efficiently in the Perfect Search system, enabling the quick elimination of all the irrelevant date ranges, with a heightened focus on those dates that are of importance. Search types can be used in combination, allowing you to have a keyword search or a phrase search that includes a date range to narrow results. Searches can also be used for auto-completion: using the search engine to help fill in the results or the query that you would like to build as you go. Suggesting potential search categories can also be useful, enabling you to quickly discover the taxonomy or structure given to categorized data. This also helps when searching an entire body of data to find possible words that already exist in the corpus, enabling you to home in on them.
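Two of these query types, wildcard and range search, can be illustrated with the Python standard library. This is a rough sketch under stated assumptions: the vocabulary, document ids, and dates are invented, and the approach (pattern matching over a term list, binary search over sorted dates) is a generic one rather than a description of the Perfect Search engine.

```python
import bisect
import fnmatch
from datetime import date
from typing import List

# Hypothetical vocabulary of indexed terms, kept sorted.
vocabulary = sorted(["index", "indexed", "indexing", "search", "searching"])


def wildcard(pattern: str) -> List[str]:
    """Wildcard search: match terms against a pattern such as 'index*'."""
    return fnmatch.filter(vocabulary, pattern)


# Hypothetical (date, document id) pairs, sorted by date.
dated_docs = sorted([
    (date(2009, 1, 5), "doc-a"),
    (date(2009, 3, 1), "doc-b"),
    (date(2009, 7, 20), "doc-c"),
])
dates_only = [doc_date for doc_date, _ in dated_docs]


def date_range(start: date, end: date) -> List[str]:
    """Range search: two binary searches skip every date outside the range."""
    low = bisect.bisect_left(dates_only, start)
    high = bisect.bisect_right(dates_only, end)
    return [doc_id for _, doc_id in dated_docs[low:high]]


if __name__ == "__main__":
    print(wildcard("index*"))
    print(date_range(date(2009, 2, 1), date(2009, 12, 31)))
```

Because the dates are sorted, everything outside the requested range is eliminated without being examined, which is the "quick elimination of irrelevant date ranges" idea in miniature.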

Perfect Search Engine, Is it Proprietary or Existing Technology?

Perfect Search CTO Ken Ebert talks about Perfect Search’s proprietary search engine, explaining that it is a new invention: it is radically different from many of the other search engines on the market today. This means that the algorithm that Perfect Search has developed is unique to our approach. This is what gives Perfect Search our extra speed, scalability, performance and indexing, as well as our ability to handle the diverse groups of data that we bring together, all in one single index to be queried. Many of the open source or generally available search engines do not perform as well. Their shortcomings are in terms of the number of queries per second they can handle, their ingestion rates, how fast they can index data, or the breadth of the data that they are able to handle. They reach maximum capacity on a server long before the Perfect Search solution would reach its capacity on the same server. Perfect Search’s unique, proprietary system allows us to provide these extra functionalities and capabilities that are not available in open source search engines.

What is a Query Farm and Why Would I Need One?

Why would a customer need a query farm or a search farm? Typically search requires computing resources, and once the computing resources of a single server are exceeded, either by the query volume a customer is going to generate or by the amount of data that is going to be indexed, you need more than one server. The efforts of more than one server are typically coordinated in a farm: an array of servers to satisfy those needs. You’ll add rows of servers to handle wider and wider amounts of data that won’t fit on a single server, and you might add columns of servers to get the query performance that you need. If your server can produce only 25 queries per second, for example, you may need three servers in order to satisfy a load of 75 queries per second. If you can fit two terabytes’ worth of index on a single server, and you have 10 terabytes of data, then you would need 5 servers to satisfy the width of that index. The result would be 3 times 5 servers, or 15 servers in your query farm, to satisfy the load that exceeds that of a single server.
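The sizing arithmetic in this example can be written down as a small helper. This is only a sketch of the calculation described above; the function and parameter names are made up for illustration, and real capacity planning would also have to allow for headroom and failover.

```python
import math
from typing import Tuple


def farm_size(target_qps: float, qps_per_server: float,
              index_tb: float, tb_per_server: float) -> Tuple[int, int, int]:
    """Return (servers for query load, servers for index width, total servers)."""
    # Enough copies of the index to carry the query load.
    for_load = math.ceil(target_qps / qps_per_server)
    # Enough partitions to hold the whole index.
    for_width = math.ceil(index_tb / tb_per_server)
    return for_load, for_width, for_load * for_width


if __name__ == "__main__":
    # The figures from the text: 75 queries per second at 25 per server,
    # and 10 terabytes of index at 2 terabytes per server.
    print(farm_size(75, 25, 10, 2))
```

Rounding up matters: a load of 80 queries per second at 25 per server would need four servers for load, not three.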

What is Ranked Search?

Ranked search means delivering the most relevant information for your search and bringing it to the top of your search results. The document title is of highest importance. The abstract of the document is second. Content in the body of the text is third. If the title contains the query terms, as does the abstract, and they are bolded in the body of the text, the document would rank at the top.
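The title, abstract, body ordering can be sketched as a weighted score. The weights below (3, 2, 1) are assumptions chosen only to preserve that ordering, the documents are invented, and the bold-text signal mentioned above is left out to keep the example short; this is not the scoring formula Perfect Search uses.

```python
from typing import Dict, List

# Assumed weights: title matters most, then abstract, then body.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def score(doc: Dict[str, str], term: str) -> float:
    """Add the weight of every field whose text contains the term."""
    term = term.lower()
    return sum(weight for field, weight in FIELD_WEIGHTS.items()
               if term in doc.get(field, "").lower().split())


def rank(docs: List[Dict[str, str]], term: str) -> List[str]:
    """Return ids of matching documents, best score first."""
    scored = [(score(doc, term), doc["id"]) for doc in docs]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for doc_score, doc_id in ordered if doc_score > 0]


if __name__ == "__main__":
    docs = [
        {"id": "c", "title": "Cooking", "abstract": "Recipes", "body": "search for recipes"},
        {"id": "b", "title": "Indexing", "abstract": "How search works", "body": "text"},
        {"id": "a", "title": "Search basics", "abstract": "Intro to search", "body": "search engines"},
        {"id": "d", "title": "Gardening", "abstract": "Plants", "body": "soil"},
    ]
    print(rank(docs, "search"))
```

A document matching in all three fields outranks one that matches only in the abstract, which in turn outranks a body-only match.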

What is a Scan Search?

What happens when you’re out in the weeds of your search system? This is probably familiar if you have searched in Windows XP with the little dog. When you search with a scan search, the little dog walks around and around, and you might even have to take a coffee break or lunch before you find what is on your computer. With an index search solution, you look at all the items in advance. When you ask the question, you know exactly where the answer is. Liken this to a person who knows where everything is in their house. The other place you get into trouble is in a system like SQL, where you have to predefine what indexes exist. What happens if you’re not asking the same question that the index reflects? What if you want to know what people exist in the system rather than what ID they have? With a system like Perfect Search, you index across all the different facets of your data. So, when someone asks a question that wasn’t predetermined, you already have those pieces of information in the index and are able to return a result quickly. This eliminates waiting for the little dog turning around in circles while he tries to find your answer in XP.

Perfect Search's Supported Platforms

Perfect Search’s search algorithm is exceptional, and one of the reasons is the way it scales. It is written in very portable C and C++ code. This allows us to scale all the way down to embedded processors and mobile devices, and all the way up to high-performance scenarios like supercomputing or entire server farms. We have a Macintosh solution for Intel-based Macs. We have solutions for 32-bit and 64-bit Windows, across the different flavors and versions. Because of the way the software is written, we have also scaled across the different versions of Linux. We’ve been very successful at getting our search algorithms and software stack to run on the platforms that our customers need.

Oracle SES and the Perfect Search's Search Appliance

We compare Oracle’s Secure Enterprise Search (SES) to Perfect Search’s search appliance. The Perfect Search Appliance takes our best-of-breed indexing technology and wraps it in a standalone appliance used to search documents, databases, and other types of data in your organization. Oracle’s Secure Enterprise Search is built around their database technology and the Oracle Text search solution inside their database to accomplish much the same task. But because you are using large pieces that are part of a much bigger enterprise stack, you end up paying a lot of overhead for pieces that are not really part of the search solution to begin with. With an Oracle solution you’re paying for the database management layer and for all of the enterprise stack that does not necessarily contribute to the user’s experience. As a result, you end up paying a lot more for all the required software and hardware. With Perfect Search, you get superior performance from a much leaner system: the very best search engine, combined solely with the pieces you need to provide the web interface and the user layer on top of it. You end up with a solution tailored to be exactly what you need to search all the different databases and data stores in your enterprise.

How Does Enterprise Search Differ From Internet Search

Internet search searches all web pages equally. Enterprise search searches discrete silos of data: RDBMS, accounting, SharePoint, enterprise contact management, email, and so on. Perfect Search unifies these disparate data sources under one single query, providing federated search.

The Difference Between a Crawler and a Spider

When you need to find your data, as you do when you import it into a search system, you need to go out and locate all the data that’s on your network. You can do this in a variety of ways. You can crawl the data: you tell the crawler where the data is located, and it walks through the items. The difference between a crawler and a spider is that a spider goes further than a crawler. It looks at the contents of each resource and determines whether there is a link to another item, and then goes and retrieves that item as well. When you spider, you’re crawling the whole web of data, including the links from one piece of data to another. When you crawl, you’re simply enumerating the resources at which you’ve been pointed.
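
The distinction can be sketched over a small in-memory set of resources. The resource names and the `links` field are hypothetical; a real crawler or spider would read from a file system or network instead.

```python
from collections import deque

# Hypothetical resources; "links" lists the other items each one points to.
resources = {
    "a": {"links": ["c"]},
    "b": {"links": []},
    "c": {"links": ["d"]},
    "d": {"links": ["a"]},
}

def crawl(resources, seeds):
    """Crawl: enumerate only the resources we were pointed at."""
    return [name for name in seeds if name in resources]

def spider(resources, seeds):
    """Spider: also follow the links found inside each resource."""
    visited, order, queue = set(), [], deque(seeds)
    while queue:
        name = queue.popleft()
        if name in visited or name not in resources:
            continue
        visited.add(name)
        order.append(name)
        queue.extend(resources[name]["links"])
    return order

print(crawl(resources, ["a", "b"]))
print(spider(resources, ["a", "b"]))
```

Starting from the same two seeds, the crawl returns only those two items, while the spider also reaches the items they link to. The visited set keeps the spider from looping when links form a cycle.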

Why is Speed Important to Search Deployment

The Perfect Search engine has incredible speed, which can be a valuable asset in your deployment. Perfect Search often shows benchmark numbers that go into the thousands of queries per second, including the full network turnaround time of those queries. You may look at these numbers and ask, “I may get 1,000 queries in a day; why would I need 1,000 queries in a second?”

When you have the kind of speed Perfect Search delivers, it enables functionality that you simply could not explore with a traditional approach. It enables detailed faceted search. The user can turn around and ask a more specific query about their initial question in the time another engine might need just to return the initial response. If you run a complex query, you receive responses quickly enough that you can change your mind or add more criteria to the search and arrive at a more precise answer sooner. Perfect Search enables functionality like facets: you can add categorizations, and you can sort your answers based on individual searches that you might perform. Perfect Search delivers functionality, capacity, and performance that are simply not available with traditional search technology.

Open Source Search Solutions

How does Perfect Search compare to open source solutions? Perfect Search’s engine is proprietary and closed. We do, though, open up our connector story, so the interfaces to all the pieces required to make it work are free and open. You can connect all the pieces in your enterprise and in your system with ease. Perfect Search integrates as easily as a free solution such as Lucene, but Lucene has to be evaluated on all aspects, not just the fact that it is cost-free or free as in open source. You need to evaluate whether the solution itself will scale to your environment. This is an area where Perfect Search compares quite favorably. Because our query syntax is similar to Lucene’s, we have found that most people moving from Lucene can drop in Perfect Search as a replacement at little cost and scale better. Compared to Lucene, Perfect Search enables far greater performance, speed, capacity and scalability.

Perfect Search and Microsoft SQL Search

Perfect Search and Microsoft SQL Server are competitive technologies, although they primarily address different problems with some overlap. SQL Server is oriented and optimized toward structured data. A classic example includes names, addresses, phone numbers, notes, zip codes, and so on. The end result is a database table with a million records, with every record sharing a common structure. A traditional relational database allows you to efficiently identify subsets of your data based on the value in a particular field. Perfect Search fully supports structured search, and it equally supports unstructured search. A tremendous amount of data that needs to be searched may be hiding in a place that is a lot less organized than a customer service database. For example, in the world of genealogy, you may have records about people who lived hundreds of years ago. Records could include military discharge papers or unstructured data such as a letter scribbled on a piece of paper. This is obviously not in nice, tidy columns with the name, serial number and rank of the person, although that information may appear somewhere in the text. Perfect Search is exceptional at searching this type of unstructured data.

Perfect Search also provides tremendous performance with structured data search. Combine the power of SQL Server with the flexibility and power of Perfect Search and you have a best-of-breed solution. You can find data no matter how it’s encoded or structured across your enterprise. Where SQL Server is also involved, the recommended way to deploy Perfect Search is to think of a seamless partnership: allow SQL Server to perform where it effectively addresses its use case, and then plug the holes with Perfect Search.

Full Text Search, Faceted Search and a Taxonomy

Full Text Search: Search on unstructured data.

DBMS Search: Search on structured data.

Faceted Search: Drill in to progressively narrow what you are looking for. For example, you’re on eBay searching for a 1967 sports car. Your result shows 1120 eBay listings, with categories on the side: 110 Ford, 492 Chevy, 145 Chrysler. You then go to Ford and see Mustang and Thunderbird, with 106 Fastback and 39 convertibles.

Taxonomy: Faceted searches are divided into categories based on a predefined taxonomy. There are well-known taxonomies and industry-specific taxonomies: the Library of Congress classification; the Dewey Decimal System; pharmaceutical taxonomies; and biology’s kingdom, phylum, class, order, family, genus and species. This allows people to get progressively closer to interesting data. Perfect Search supports faceted search and taxonomy in a very innovative way. Our demos show how we’re able to provide powerful taxonomies and faceted search to effectively and quickly find needed data.
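
Faceted drill-down amounts to counting results per category and then filtering on the category the user picks. The sketch below uses a handful of invented listings, so the counts are not the eBay numbers quoted above.

```python
from collections import Counter

# Invented sample listings for illustration.
listings = [
    {"make": "Ford", "model": "Mustang", "body": "Fastback"},
    {"make": "Ford", "model": "Mustang", "body": "Convertible"},
    {"make": "Ford", "model": "Thunderbird", "body": "Convertible"},
    {"make": "Chevy", "model": "Camaro", "body": "Convertible"},
]

def facet_counts(items, field):
    """Count how many results fall under each value of a facet."""
    return Counter(item[field] for item in items)

def drill_down(items, **filters):
    """Keep only the results that match every selected facet value."""
    return [i for i in items if all(i.get(k) == v for k, v in filters.items())]

print(facet_counts(listings, "make"))
fords = drill_down(listings, make="Ford")
print(facet_counts(fords, "model"))
```

Each drill-down narrows the result set, and the counts for the next facet are recomputed over what remains.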

What is Federation? What is Hyper-Federation?

Enterprise search vendors use the term federation to describe searching across multiple, disparate data sources. Organizations have information stored in databases, Exchange, Lotus Notes, SharePoint (intranet) and email. Search vendors sell federation features at varying levels of sophistication. Different silos of information have different security and different response times, and different credentials require different responses. Submitting a federated query means asking the same question of all silos at the same time. Your query response is returned in a combined fashion, although you will only be as fast as the slowest silo. Searching a database may be easy and inexpensive, whereas searching an email silo would be more expensive. Perfect Search provides hyper-federation, which eliminates the security problem. It also eliminates the performance problem. Because Perfect Search scales so much higher than traditional search technologies, it delivers a significant benefit in the amount of data and the number of data sources that can be incorporated.
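
The "only as fast as the slowest silo" point can be illustrated with a simulated query-time federation. The silo names, latencies, and hits are invented, and the silos are assumed to be queried in parallel. This models ordinary federation, not hyper-federation.

```python
# Invented silos with simulated response times and results.
silos = {
    "database": {"latency_ms": 20, "hits": ["doc1", "doc2"]},
    "email": {"latency_ms": 900, "hits": ["doc2", "doc3"]},
    "sharepoint": {"latency_ms": 150, "hits": ["doc4"]},
}

def federated_query(silos):
    """Ask every silo the same question, then dedupe and combine."""
    merged = []
    for silo in silos.values():
        for hit in silo["hits"]:
            if hit not in merged:  # drop duplicates across silos
                merged.append(hit)
    # Queried in parallel, the combined answer waits for the slowest silo.
    elapsed_ms = max(s["latency_ms"] for s in silos.values())
    return merged, elapsed_ms

print(federated_query(silos))
```

Even though the database answers in 20 ms, the combined response is not complete until the email silo returns.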

What is Search Latency?

When you do a search, you expect results to reflect all currently available data. The assumption is that the index that has been built is synchronized with the actual data. But between the time data is created and indexed and the time you do a search, it’s possible that the data will change. Search technology tries to drive down this time, decreasing the latency. Perfect Search can index quickly and drive latencies down to a matter of seconds, as opposed to minutes or hours.

Difference Between a Boolean and Bayesian Search

Bayesian Search is based on a statistical technique that looks at the overall data set and decides what the best matches to a query are, based on statistical criteria. This can be very useful. However, it has a fatal weakness: it is sensitive to even a minor change to the data set. Even after adding one record, you need a new statistical analysis of the entire data set before you can decide whether something still matches or is still the best match out of a set. It’s common for a company with a large corpus to add records all the time: new SKUs, new records, new support issues. The statistical profile of the data set is constantly changing, which makes it very difficult to update a corpus and an index in real time.

Boolean Search is based on simple criteria where you receive either a thumbs up or a thumbs down. Either something matches or it does not. For example, search for all the documents related to apples and fruit in general. Either a document relates or it does not, based on the Boolean criterion of containing the word apple or fruit, and you don’t have to consult the rest of the corpus. As a result, when you add a new document, the value or ranking of the documents you have already seen does not change.
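
The independence property of Boolean matching is easy to demonstrate. In this sketch, with made-up documents, each document is judged on its own words only, so adding a record leaves earlier verdicts untouched.

```python
def matches(doc, terms):
    """Boolean OR: thumbs up if the document contains any of the terms."""
    words = set(doc.lower().split())
    return any(term in words for term in terms)

query = {"apple", "fruit"}
corpus = ["apple pie recipe", "fruit salad ideas", "engine repair manual"]

before = [matches(doc, query) for doc in corpus]
corpus.append("fresh fruit market")  # add one new record
after = [matches(doc, query) for doc in corpus]

print(before, after)
```

The first three verdicts are identical before and after the addition. A statistical ranking, by contrast, may have to be recomputed over the whole corpus.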

Taking Advantage of Computer Hardware for Faster Searching

Because of our core algorithm, Perfect Search is inherently fast with the index housed on disk. With the index housed in memory, we are extremely fast. Perfect Search takes advantage of hardware in a way that other systems are unable to. Say Perfect Search delivers 10 queries per second (QPS) on a 10K RPM drive and we want to reach 20 QPS. We simply add another 10K drive and get close to double the speed. Most systems don’t get that kind of uptick. If we RAID multiple drives together, we get similar speed increases. SAS systems, multiple drives, solid state drives: we take advantage of these drives unlike any other enterprise search technology. Solid state drives can achieve up to 100,000 512-byte random reads per second, compared to a disk drive that gets about 400 or 500. We take advantage of multiple processors and multiple threads as we crawl, index and search. In addition to our algorithms, all of this parallelism works with our system and software. At the end of the day, these caching and efficiency methods enable a massive amount of data to be processed and searched efficiently.

Indexing of Massive Data Sets

On the indexing side, Perfect Search has a pipeline architecture that digests the information to be indexed in small pieces and in small incremental steps. This enables us to maintain a constant indexing speed. It uses a lot of temporary files and keeps index creation speed steady. We are able to merge indexes together as needed, and we don’t have to merge them immediately into one huge index. Perfect Search can work with several separate pieces and then, at our leisure, merge them together, still achieving impressive speed even while processing multiple indexes. Our philosophy of indexing uses not just individual words or terms, but what we call molecules of information, which may involve combinations of terms and the fields and locations of those terms. A bolded term is an example. These molecules of information result in shorter location lists, which enable greater speed when processing. When we have large reference lists, such as all the people born in 1910 in a genealogy database, we have methods to search the list as though it were a small list, dramatically increasing search speed. In memory we keep files called accelerators that are approximately 5% of the size of the main index. These accelerators enable Perfect Search to speed up operations while the main index is housed on disk.
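
One possible reading of the "molecules" idea is an index keyed on a combination such as field plus term, alongside the usual per-term index. The sketch below is only an illustration of why combined keys give shorter location lists. The data and key layout are assumptions, not Perfect Search's index format.

```python
from collections import defaultdict

docs = {
    1: {"title": "wheat harvest", "body": "notes on the wheat harvest"},
    2: {"title": "farm news", "body": "wheat prices rose"},
    3: {"title": "search speed", "body": "wheat is not mentioned in the title"},
}

def build_indexes(docs):
    atoms = defaultdict(list)      # term -> document ids
    molecules = defaultdict(list)  # (field, term) -> document ids
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for term in set(text.split()):
                if doc_id not in atoms[term]:
                    atoms[term].append(doc_id)
                molecules[(field, term)].append(doc_id)
    return atoms, molecules

atoms, molecules = build_indexes(docs)
print(atoms["wheat"])                 # every document mentioning the term
print(molecules[("title", "wheat")])  # only documents with it in the title
```

A query for "wheat in the title" reads the short combined list directly instead of reading the long term list and filtering it.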

Searching Indexed Massive Data Sets

How can Perfect Search handle the search of massive data sets once they have been indexed? Perfect Search is able to search multiple indexes simultaneously even though they have not been merged together. This makes the indexing process simpler. Indexes can be on disk and do not have to be in memory or cached. Perfect Search accesses indexes with our accelerator files. Whether a 1 GB index or a 100 GB index is coming off the disk, access is comparable. Perfect Search uses a micro search approach. The traditional approach is a relevance bubble-up method. This means that the very last term and the very last reference to that term may bubble up to become the highest-ranking result, so everything has to be considered. You may have some terms that are expansive, such as all the people born in 1910, all the people who died in 1960, or all the people living in Washington State.
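
Searching several unmerged indexes can be sketched as a union over index segments. The segment contents here are invented, and the dictionaries stand in for on-disk index files.

```python
# Two hypothetical index segments that have not been merged yet.
segments = [
    {"wheat": {1, 2}, "farm": {2}},
    {"wheat": {7}, "harvest": {7, 8}},
]

def search_segments(segments, term):
    """Query every segment and combine the hits."""
    hits = set()
    for segment in segments:
        hits |= segment.get(term, set())
    return sorted(hits)

def merge_segments(segments):
    """Fold the segments into one index whenever it is convenient."""
    merged = {}
    for segment in segments:
        for term, ids in segment.items():
            merged.setdefault(term, set()).update(ids)
    return merged

print(search_segments(segments, "wheat"))
print(sorted(merge_segments(segments)["wheat"]))
```

The answer is the same whether the merge has happened yet or not, which is why the merge can be deferred.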

Perfect Search’s approach is quite different. We decide on the exact specification of the best possible match. For example, this could be a particular topic in the title of the document, such as an exact phrase. We then go and search for that exact item. If enough hits are found, the search is complete; otherwise we drill down to the next level of relevance. Perfect Search must be, and is, extremely fast on these small micro searches. If we run 100 micro searches, each one has to be extremely fast for us to beat the bubble-up approach that is traditionally used. We avoid reading bytes from the index whenever possible by using accelerator tables. With tightly specified searches, the item may not exist anywhere in the index. That’s ideal: Perfect Search immediately drops that search and moves on to the next possible one. We have done various studies, and the fact that we are faster is obvious simply from looking at how many bytes were actually read off the disk. Because of our approach, our system may read 500,000 bytes off the disk. We may jump around, but we are very frugal in where we read. Other systems may be reading 500,000,000 bytes off the disk. It’s obvious why one is much faster than the other.
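
The control flow described above, trying the most specific search first and stopping once enough hits are found, can be sketched as follows. The index keys, tiers, and document ids are hypothetical, and a plain dictionary stands in for the index and its accelerator tables.

```python
def micro_search(index, tiers, wanted):
    """Run tightly specified searches from most to least relevant.

    tiers: index keys ordered from the best possible match downward.
    Returns the hits in relevance order and the number of lookups made.
    """
    results, probes = [], 0
    for key in tiers:
        probes += 1
        # A key that is absent costs almost nothing: drop it and move on.
        for doc_id in index.get(key, []):
            if doc_id not in results:
                results.append(doc_id)
        if len(results) >= wanted:
            break  # enough hits, so lower tiers are never read
    return results[:wanted], probes

index = {
    ("title_phrase", "wheat harvest"): [7],
    ("title", "wheat"): [7, 3],
    ("body", "wheat"): [7, 3, 9, 12, 15],
}
tiers = [("title_phrase", "wheat harvest"), ("title", "wheat"), ("body", "wheat")]
print(micro_search(index, tiers, wanted=2))
```

With two hits wanted, the search stops after the second tier and never touches the long body list, which is where the savings in bytes read would come from.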

Another very important point is that once a particular search has been cached in memory, because it has already been read from the disk, we can sometimes process it 100 or even 1,000 times faster. In our micro search architecture, an earlier search may share a common term with a later search. If the common term has already been read, the later search goes much faster. We take advantage of both hardware and software caching.

How Did Perfect Search Begin?

Perfect Search Co-founder and Chief Scientist Ron Millett explains the history of Perfect Search. Ron’s expertise in search technology goes back over 25 years. He went to WordPerfect and then Novell, where he developed a system called QuickFinder, which was incorporated into the first word processor and search engine. This was WordPerfect 5.2. An email package with search incorporated into it was called GroupWise 5.1. Flash forward to the beginning of Perfect Search, when an idea was sparked by the fact that the speed of retrieval had not progressed for many years. Ron knew there must be a way to increase search speed rather than proliferating farms of computers and servers. As Ron started looking into this, working with his good friend Bruce Tietjen, they pressed ahead with the idea of searching by molecules rather than by atoms, to use an analogy. Perfect Search’s foundational concepts were born.

Ron then started working with Dillon Inouye, a Stanford PhD who was a professor at BYU and an entrepreneur involved in Folio Corporation, which at one point had quite a significant part of the search market for CD-ROMs and technical journals. Dillon offered innovative ideas that ran contrary to a computer science approach. He continued to challenge the limitations. The barriers were dismantled, and they were led to a method that was dubbed the hyperspace engine of search. Today Perfect Search is using Hyperspace III. With the help of his partners, Ron’s performance projections, which seemed outlandish at the time, in many cases ended up being absolutely correct. This is why he is so excited about Perfect Search and its game-changing technology.

Searching is Like Harvesting Wheat

Perfect Search Co-founder and Chief Scientist Ron Millett explains that he and a friend, Bruce Tietjen, had about 3/8 of what they eventually needed for the innovations of Perfect Search to become a viable technology. Dillon Inouye, a PhD from Stanford University, was an entrepreneur and a professor at BYU. Dillon had a family farm in central Utah. It was while sitting at this farm, watching the combine harvest wheat, that he had his light bulb moment. He said it was as if he had known it all of his life. He was struck by the analogy between the combining process and enterprise search.
