Wednesday, 11 July 2012

Search Engines & Information Intelligence | Research Keys, Part 1

Research Keys is a series of blog posts covering essential resources on search engines and information intelligence; directories and sites for kids, K-12 and university students; global search sites for jobs and travel; multimedia search engines; online library guides; content-rich subject guides by educators and professionals; resources by foundations, universities and governments; open access repositories; content curation sites; similar sites and their application in research; free Excel tutorials; and global and European research scholarship websites. Although the focus is clearly on research, many of the resources are also useful for business and individual targeted search (e.g. competitive intelligence sites; sites for jobs and travel; library guides and content-rich websites, which are freely accessible and cover many areas; Excel resources).

Research Keys, Part 1 starts with a comprehensive collection of key search engine resources and guides – how to search the internet effectively; where to find topical search engines, dictionaries, news and marketing columns. Semantic search engines, deep web tutorials, information and competitive intelligence, big data and knowledge management resources are also on the list.

In order to get the most out of the available online information, more than one search engine should be used. While semantic search engines will narrow the search and give specific information, meta search engines will broaden it and return unexplored, often unexpected sources. It is not uncommon for searchers to spend a lot of time looking for meaningful information in response to a particular query. The current business model of search engines (sources of income) and the indexing and ranking mechanisms (how authority and quality are defined) pose problems for search quality. However, not knowing where and how to search remains the biggest challenge (which probably prompted Google to launch a search training site for students, including a Google power searcher course that just started, and resources for educators). In addition, there is information on the web which is freely available but hidden in databases (dynamic vs. static URL; see Deep Web section).
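As a rough illustration of the dynamic vs. static URL distinction mentioned above, the sketch below flags a URL as probably dynamic when it carries a query string. The function name `looks_dynamic` and the example URLs are hypothetical, and the rule itself is only an assumption for illustration: real crawlers decide what to index using far richer criteria.

```python
from urllib.parse import urlparse

def looks_dynamic(url: str) -> bool:
    """Heuristic (an assumption): a URL with a query string is likely
    generated on demand from a database rather than stored as a static page."""
    return bool(urlparse(url).query)

# Hypothetical example URLs, not taken from any site listed in this post
static_url = "http://example.org/guides/searching.html"
dynamic_url = "http://example.org/search?subject=chemistry"

print(looks_dynamic(static_url))   # False: no query string
print(looks_dynamic(dynamic_url))  # True: the page depends on the query
```

Pages of the second kind only exist once a query is submitted, which is one reason database content can stay hidden from crawler-based engines.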

The combined use of all search services – engines (crawler-based), directories (human-powered) and databases – will yield efficient and hopefully effective results. The list below and the whole Research Keys series are a good starting point for exploring content-rich resources relevant to your research.
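To make the idea of combining several search services a little more concrete, here is a minimal sketch of how ranked result lists from different sources could be merged into one. It uses reciprocal rank fusion, a common merging technique chosen here purely for illustration; it is not claimed to be what any meta search engine mentioned in this post actually does. The function `merge_results`, the constant `k=60` and the result lists are all hypothetical.

```python
from collections import defaultdict

def merge_results(result_lists, k=60):
    """Merge ranked result lists with reciprocal rank fusion (RRF).

    Each source contributes 1 / (k + rank) to a result's score, so results
    returned near the top by several sources rise to the top of the merge.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical result lists from an engine, a directory and a database
engine = ["a.example", "b.example", "c.example"]
directory = ["b.example", "d.example"]
database = ["b.example", "a.example", "e.example"]

merged = merge_results([engine, directory, database])
print(merged)  # b.example comes first: all three sources return it
```

The merged list also contains results that only one source knew about, which is the "broadening" effect attributed to meta search above.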

Search Engines: History & Resources 

Search Engine History - a brief guide on the history of internet; some basic web terms explained; key thematic resources listed

How Search Engines Work - a guide (a freely downloadable pdf) published on Search Engine Watch by Mike Grehan (global content director with Incisive Media), excerpted from his 2002 publication Search Engine Marketing: The Essential Best Practice Guide. A description of the various “modules” of a search engine is followed by interviews with technology experts from Alta Vista, WebCrawler and Google - issues such as the vector space model, the search engines' business model and the PageRank technology are touched upon.
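For readers new to the vector space model mentioned in this guide, the following is a minimal sketch under simplifying assumptions: documents and the query are turned into raw word-count vectors and compared by cosine similarity. Production engines use weighting schemes and many other signals, so this is not a description of how Alta Vista, WebCrawler or Google work. The function name and the three sample documents are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two raw term-count vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical mini collection
docs = {
    "doc1": "guide to deep web search tools",
    "doc2": "history of the internet",
    "doc3": "web search engines",
}
query = "deep web search"

# Rank documents by similarity to the query, most similar first
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)
print(ranked)
```

Documents that share more of the query's words, relative to their length, score higher; a document with no words in common scores zero.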

Guide to Effective Searching of the Internet (a pdf of 68p.) by Michael Bergman published on the Bright Planet site. Through a particular search question to be resolved via the web, the author provides guidelines for an optimized search query using Boolean operators, and then discusses the various search services (directories, deep web tools).
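As a small illustration of what Boolean operators do to a result set, the sketch below filters a few documents with AND, OR and NOT conditions. It is a toy model written for this post, not code or an example taken from Bergman's guide; the function `matches`, its parameters and the sample titles are hypothetical.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Return True if text satisfies a simple Boolean query.

    all_of  -> terms joined by AND (every term must appear)
    any_of  -> terms joined by OR (at least one must appear, if any are given)
    none_of -> terms excluded with NOT (none may appear)
    """
    words = set(text.lower().split())
    return (all(term in words for term in all_of)
            and (not any_of or any(term in words for term in any_of))
            and not any(term in words for term in none_of))

# Hypothetical document titles
titles = [
    "Deep web tutorial for students",
    "Deep web marketing guide",
    "Surface web guide",
]

# Query: deep AND web AND (tutorial OR guide) NOT marketing
hits = [t for t in titles
        if matches(t, all_of=("deep", "web"),
                   any_of=("tutorial", "guide"),
                   none_of=("marketing",))]
print(hits)
```

AND and NOT narrow the result set, while OR widens it, which is why combining them gives finer control than typing a list of keywords.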

Resources for Searching the Web by Acq Web - several tutorials about searching and understanding the Internet, web directories and search engines listed; and some more published on the old Acq Web website

SE articles and guides published on the directory: How to Properly Research on the Internet; the 10 Best Search Engines for Beginners based on readers' suggestions; search engines and directories A-Z; and SE by categories; the Best Reference Sites Online, links to many related pages at the bottom of the page; all articles on search engines published on the website

Links to 3 Tutorials on Searching the Internet by the Search Engine Guide 

Online video lectures from the UC Berkeley 2005 Course Search Engines: Technology, Society, and Business; and presentations. Prof. Hearst's publications are freely available on her website and her 2009 book Search User Interfaces can be read online

Search Engine Colossus: International Directory of Search Engines – search engines by country, language and by category (on the right hand side)

Wikipedia’s List of Search Engines - SE by category, geographical area, databases, desktop search tools; a link to search engines indexed by the Open Directory Project (DMOZ); and all pages in the category 'Internet Search Engines'

Results returned for the query “Search Engines” by the LibGuides Community

Search Engines listed by the Search Engine Guide 

The Search Engine List - a topical list of Search Engines

Metacrawlers and Metasearch Engines - a compilation by the Search Engine Watch 

24 Metasearch Engines for Centralized & Efficient Searching by the Search Engine Journal (2008)

Search engines (general, specialty, international) by the Search Engine Watch and links to other sites which provide lists of search engines

Pandia - a site dedicated to search engines and productive Internet searching - search trends, tutorials and links to many other SE and information intelligence resources including non-English sites

SE Resources by Pandecta 

Topical Search Engines based on Pandecta’s Search Engine Yearbook 2003

The Big Search Engine Index - a UK-based directory of engines by category and region; some sites are no longer active

35 Interesting but Lesser Known Facts about Google, the Search Engine King by Tech Chunks 

Webopedia - online computer dictionary

Search Engine Dictionary, a project by Pandecta Magazine - a glossary of SE terms

Search Engines: News, Analysis & Marketing 

The official Google Search Blog - search highlights are announced on a regular basis; see also Google Scholar blog as well as other official blogs in the Google Blogs Directory; and Google updates on YouTube

Search Research - a blog about effective Internet search by Daniel Russell, employee at Google (the blog was mentioned by Info Today)

Open Source Enables High-Volume Searches, an article (May 2012) written by Stacy Collett for Computer World; here are all their articles on applications and internet search

Your Resume Getting Past the Machines, an article (June 2012) written by Ken Moore for Computer World; more articles on IT Careers

Search Engine Showdown – news, strategies, analysis

Search Engine Watch - tips and information about searching the web, analysis of the search engine industry

Search Engine Land - a news and information site covering search engine marketing (a Search Cap daily newsletter), searching issues and the search engine industry; Guides to Google, Bing and Yahoo

Search Engine Journal - SEJ spotlights the important trends, news, strategies and personalities in the industry

Search Engine Guide: The Small Business Guide to Search Marketing - SE news, marketing and a directory of search engines by topics

Search Engine Optimization (SEO) sections of SE related sites provide tips on site visibility and ranking for search engines. Common SEO issues include the use of keywords, meta tags and link building. Google has provided some resources on building a Google-friendly site and on crawling, indexing and ranking issues - see the Google Webmaster Tools site and blog; you may find Google’s SEO Starter Guide useful, and check Google Webmasters Q&A on YouTube

Semantic Search Engines 

Semantic SE are also called Web 3.0 Search Engines – Google and Bing have semantic features to some extent, but semantic SE focus entirely on the meaning and relation of words rather than on the combination of keywords used by traditional SE. Semantic SE show a list of possible meanings; one can ask questions, link and compare data, and generate lists of things (e.g. Kngine) or summaries (e.g. SenseBot), or see highlighted phrases and sections (e.g. Cognition). Read the SE manuals carefully in order to enter your query correctly. Some engines from the articles below might no longer exist, e.g. Powerset and Kosmix were acquired by Microsoft (for Bing) and Walmart (Walmart Labs), respectively. Other engines use semantic technology but are different in their own way, e.g. Quixey - the search engine for applications - has developed a functional search based on a machine learning approach, Wolfram Alpha is a computational knowledge engine, while Evri is a content discovery engine, a mix of social curation and topical streams.

Publications on the Semantic Web by the World Wide Web Consortium (W3C)

Top 5 Semantic Search Engines - an article (2009) by Pandia

Top 7 Semantic Search Engines as an Alternative to Google - an article (2010) published by MakeUseOf

9 Semantic Search Engines That Will Change the World of Search - an article (2009) by the Search Engine Journal

News on semantic search engines published on the news section of the semantic SE Sensebot - here you will also find other semantic SE

What is Semantic Search? - 10 Things that Make Search a Semantic Search, an article published on the website of the SE Hakia

5 Ways Semantic Search Will Disrupt Business, an article (June 2012) by Forbes

How Google and Microsoft taught search to "understand" the Web, an article (June 2012) by Ars Technica

Quixey: a search engine for the apps era (some other apps SE are mentioned), an article (2012) by the Search Engine Land

SEO for Semantic Search Engines - an article (2008) by the Search Engine Land

Information Intelligence & Competitive Intelligence 

Competitive Intelligence – Get Smart! - an article (1998) by Fast Company in which information industry experts discuss internet tools, positive and negative aspects of firms' web presence for search needs and market competitiveness

Competitive Intelligence - an article with references by Reference for Business, Encyclopedia of Management; and an article on the same topic by Reference for Business, Encyclopedia of Business, Advameg encyclopedias

Competitive Intelligence - an article by the Free Management Library with a number of resources including links to an online course – pdf files by the Financial Management Training Center

The Virtual Private Library - developed by Marcus Zillman, e Solutions Architect & Internet Expert, and powered by Subject Tracer Information Bots – papers containing many useful links to search engines. Some of his freely downloadable research white papers cover Searching the Internet; student research resources; scholar search engines and sources; business intelligence resources; see also his Tutorials and Directories compilations; research guides by subject; healthcare, business and legal resources; data mining resources; military resources; employment resources, etc. New resources are added on a regular basis using bots, blogs and news aggregators; you may subscribe to his monthly awareness watch newsletters

FreePint Products for finding and managing information - library and web-based information news, government reports, business information mining and intelligence; some services are subscription-based, others are free. They have compiled (June 2012) a list of selected resources for competitive intelligence in biopharma 

Fuld & Company: The Global Leader in Competitive Intelligence – a research and consulting firm in the field of competitive intelligence. The Internet Intelligence Index (links to more than 600 sites including general business resources; various industry sites; international internet resources) and the Pharma-Healthcare Internet Intelligence Index (more than 250 sites in the pharmaceutical and healthcare industries; disease information, healthcare associations, databases, journals and news, gov agencies, brand management) are some of the freely downloadable papers from their resource center; check their blog for updates

Info Today - products and news for librarians and information industry professionals, many blogs and resources are available

Various articles on CI can be found in the Academy of Competitive Intelligence (ACI) resource center

Open Resources on Competitive Intelligence provided by the Strategic and Competitive Intelligence Professionals

Deep Web Resources 

Deep Web University tutorials by the Bright Planet, a company specialized in deep web intelligence (the source was quoted by M. Zillman)

Deep Web FAQ by the Bright Planet

The Deep Web: Surfacing Hidden Value - Bright Planet's white paper written by Michael Bergman in 2001  

Some key insights from Michael Bergman’s paper are that the deep web is the largest growing category of new information on the Internet; that Google indexes only 16 % of the surface web, while content in the deep Web is nearly 500 times greater than that visible to conventional search engines; that topical databases account for more than 50 % of the Deep Web content; and that 95 % of the deep Web content is publicly and freely accessible (Bergman 2001). The author points out some limitations of the surface web and the importance of being aware of the Deep Web, describes how Bright Planet’s search technology retrieves both "deep" and "surface" content, and analyses the largest deep web sites, including a table of 60 sites such as academic databases and US government portals.

Bright Planet has provided a link to the content-rich website Web Search Guide where many tutorials and updates on searching the Internet can be found 

10 Search Engines to Explore the Invisible Web - an article written by Saikat Basu for MakeUseOf (quoted by Bright Planet) 

Research articles about the web and a compilation of key invisible sites and databases by John Royce, Robert College library director - Webliography prepared for the NESA Administrators 2001 Conference; and the updated list for the 2002 conference; more bibliographies (2003-2004) and presentations available here

Free Federated Search Engines for Scientific Data and Literature (Nov 2011), an article by the Intellogist, a blog maintained by patent information professionals

Deep Web Search Engines by, a website maintained by McCartney Taylor

Deep Web Technologies is a federated search provider. They have made available online white papers and presentations on the deep web & federated search; see also their two blogs – Deep Web Tech and a blog on federated search covering resources and books on search, industry news, conference updates, etc.; here is a post about 15 articles on federated search

Some Wikipedia pages and wikis (intranets) may be considered deep web content because, although there are a number of internal and external Wikipedia search tools, subcategories are not well indexed. A wiki which I find useful is the UBC HealthLib Wiki - a wiki with portals on health librarianship, social media and a range of information technology topics curated by Canadian health librarians. Here are some of their resourceful health-related pages: (1) Health specific search tools, (2) Bioinformatics, (3) Drug Information – print & online sources, (4) Medical podcasts and videocasts, (5) Health 2.0 (including social networking resources) - web 2.0 tools for patients’ independent or additional use; and here are some general (not only health-related) library & scholar pages: (1) Open Search Tools, (2) Open Data, (3) Google Scholar bibliography, (4) Library 2.0 Bibliography, (5) Social Media Aggregators

Big Data & Knowledge Management (KM)

What is KM? Knowledge Management Explained, an article (May 2012) by Michael Koenig, KM World Magazine

KM World Best Practices White Papers - KM, Information Governance, Intelligent Search

KM World: 100 Companies That Matter in Knowledge Management (2012)

The knowledge movement: trends and opportunities (India), an article published in April 2012 by KM World Magazine

FreePint: Information overload: fact, fantasy or filter failure? (May 2012)

Managing information, opportunities and the role of info pros, an article (Feb 2012) by FreePint (link to a McKinsey report on big data)

What is big data? An introduction to the big data landscape, an article (Jan 2012) written by Edd Dumbill for O'Reilly Media (also mentioned by FreePint)

What is Big Data, an article (May 2012) by KM World Magazine

Data & Insight by O’Reilly Radar – focus on technology, media, healthcare, government

Big Data's Big Problem: Little Talent, an article (April 2012) written by Ben Rooney for the Wall Street Journal

Big Data In 2020: More Info, More Problems, an article (July 2012) by Fast Company about a survey on big data carried out by the Pew Research Center and Elon University

The post was updated on 31/07/2012
