YaCy - The Peer to Peer Search Engine: API, external access to the YaCy server

The YaCy API: external access to the YaCy search server

YaCy can be embedded into any web page using the search API presented here. The YaCy API also makes it easy to integrate YaCy into other programming environments such as PHP or Perl.

Data export with XML: integration of search results into other applications

Search results can be retrieved in JSON and RSS format as specified by opensearch.org, so they can be consumed by any RSS client library. For example:

> curl "http://localhost:8090/yacysearch.rss?query=foaf&maximumRecords=10"

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='/yacysearch.xsl' version='1.0'?>
<rss version="2.0"
  xmlns:yacy="http://www.yacy.net/"
  xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
>
<!-- heavily abridged example! -->
  <item>
    <title>Friend of a Friend (FOAF) project</title>
    <link>http://www.foaf-project.org/</link>
    <pubDate>Fri, 23 May 2008 02:00:00 +0200</pubDate>
  </item>
  <item>
    <title>FOAF - Wikipedia</title>
    <link>http://de.wikipedia.org/wiki/FOAF</link>
    <pubDate>Tue, 08 Jan 2008 01:00:00 +0100</pubDate>
  </item>
  <item>
    <link>http://microformats.org/wiki/xfn-to-foaf</link>
    <pubDate>Fri, 09 May 2008 02:00:00 +0200</pubDate>
  </item>
</rss>
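A response like the one above can be read with a standard XML parser. The following Python sketch uses only the standard library. It assumes a peer listening on localhost:8090 as in the curl call; the function names (`build_search_url`, `fetch_rss`, `parse_items`) are illustrative and not part of YaCy. Because the sample is abridged, the parser looks for `item` elements at any depth and tolerates missing fields.

```python
import urllib.request
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode

# Abridged sample response, modelled on the example above.
SAMPLE_RSS = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:yacy="http://www.yacy.net/"
  xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <item>
    <title>Friend of a Friend (FOAF) project</title>
    <link>http://www.foaf-project.org/</link>
    <pubDate>Fri, 23 May 2008 02:00:00 +0200</pubDate>
  </item>
  <item>
    <link>http://microformats.org/wiki/xfn-to-foaf</link>
    <pubDate>Fri, 09 May 2008 02:00:00 +0200</pubDate>
  </item>
</rss>"""


def build_search_url(query, count=10, base="http://localhost:8090"):
    """Build the search URL; urlencode escapes the query string safely."""
    params = urlencode({"query": query, "maximumRecords": count})
    return f"{base}/yacysearch.rss?{params}"


def fetch_rss(query, count=10):
    """Fetch the raw RSS bytes from a running peer (needs network access)."""
    with urllib.request.urlopen(build_search_url(query, count), timeout=10) as resp:
        return resp.read()


def parse_items(rss_bytes):
    """Return one dict per <item>; fields missing in the feed become None."""
    root = ET.fromstring(rss_bytes)
    results = []
    for item in root.iter("item"):  # matches items at any nesting depth
        pub = item.findtext("pubDate")
        results.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "date": parsedate_to_datetime(pub) if pub else None,
        })
    return results


# Parse the embedded sample; swap in fetch_rss("foaf") to query a live peer.
for hit in parse_items(SAMPLE_RSS):
    print(hit["link"], hit["title"], hit["date"])
```

Building the URL with `urlencode` avoids the quoting problems that `&` causes on a shell command line.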

Data import with XML: 'Surrogates'

External data sources can easily be loaded into YaCy: the data must be formatted as XML using Dublin Core metadata containers. The XML files are imported and indexed automatically when they are placed in the hand-over directory (DATA/SURROGATES/in). Here is an example of a 'surrogate' file:

<?xml version="1.0" encoding="utf-8"?>
<!-- YaCy surrogate using dublin core notion -->
<surrogates xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title><![CDATA[Alan Smithee]]></dc:title>
    <dc:identifier>http://de.wikipedia.org/wiki/Alan_Smithee</dc:identifier>
    <dc:description><![CDATA[Der als Filmregisseur oft genannte '''Alan Smithee''' ist ein Anagramm von „The Alias Men“.]]></dc:description>
    <dc:language>de</dc:language>
    <dc:date>2009-04-14T00:00:00Z</dc:date> <!-- date is in ISO 8601 -->
  </record>
</surrogates>
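A surrogate file with the structure shown above could be generated programmatically. The Python sketch below is one possible approach using the standard library; the function name `build_surrogate` and the output file name are hypothetical. Note that ElementTree writes escaped text rather than CDATA sections. XML parsers treat the two forms as equivalent, but whether the YaCy importer has any further expectations is not confirmed here.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "http://purl.org/dc/elements/1.1/"
# Make the serializer emit the "dc" prefix instead of a generated one.
ET.register_namespace("dc", DC_NS)

# Dublin Core fields used in the example above, in the same order.
FIELDS = ("title", "identifier", "description", "language", "date")


def build_surrogate(records):
    """Serialize a list of dicts into surrogate XML (UTF-8 bytes)."""
    root = ET.Element("surrogates")
    for rec in records:
        record = ET.SubElement(root, "record")
        for field in FIELDS:
            if field in rec:
                ET.SubElement(record, f"{{{DC_NS}}}{field}").text = rec[field]
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


xml_bytes = build_surrogate([{
    "title": "Alan Smithee",
    "identifier": "http://de.wikipedia.org/wiki/Alan_Smithee",
    "language": "de",
    "date": "2009-04-14T00:00:00Z",  # ISO 8601, as in the example
}])
print(xml_bytes.decode("utf-8"))

# To hand the file over to a peer, write it into the import directory.
# The file name is a placeholder chosen for this sketch:
# Path("DATA/SURROGATES/in/example.xml").write_bytes(xml_bytes)
```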

Integration into wikis, forums and blogs using the search widget

YaCy provides ready-to-use code snippets that can be integrated into the HTML code of any web page. YaCy also has specialized harvesters that load content from content management software such as blogs, wikis and forums. The YaCy search then works as a meta-search over your different data sources and can provide a faceted view which distinguishes your data sources in the search results.

Retrieval of the web page link structure

The link structure of web domains can be visualized and exported as XML data, which may be of interest to web page designers.

> curl "http://localhost:8090/api/webstructure.xml?about=yacy.net"


<webstructure maxhosts="20000">
  <references direction="out" count="1" maxref="300">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="VRAHIA" count="5">suma-ev.de</reference>
      <reference id="EMaLDQ" count="3">www.kit.edu</reference>
      <reference id="sX4ozA" count="15">liebel.fzk.de</reference>
    </domain>
  </references>
  <references direction="in" count="1">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="a_bYbR" count="32">de.wikipedia.org</reference>
      <reference id="DWDqhA" count="1">hwiki.fzk.de</reference>
      <reference id="4JR9RA" count="1">wiki.yacy.de</reference>
      <reference id="wqcWfA" count="1">www.itgrl.de</reference>
      <reference id="P290EA" count="128">www.heise.de</reference>
      <reference id="z4bRCA" count="1">blog.suma-ev.de</reference>
      <reference id="sX4ozA" count="5">liebel.fzk.de</reference>
      <reference id="FXg39Q" count="3">www.yacy.net</reference>
    </domain>
  </references>
</webstructure>
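The export above can be summarized with a few lines of Python. This sketch works on a shortened copy of the sample and collects the referenced hosts for one direction. The function name `link_counts` is illustrative, and reading the `count` attribute as the number of links to or from a host is an assumption based on the sample, not a documented definition.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample output above.
SAMPLE = """<webstructure maxhosts="20000">
  <references direction="out" count="1" maxref="300">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="VRAHIA" count="5">suma-ev.de</reference>
      <reference id="sX4ozA" count="15">liebel.fzk.de</reference>
    </domain>
  </references>
  <references direction="in" count="1">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="a_bYbR" count="32">de.wikipedia.org</reference>
      <reference id="P290EA" count="128">www.heise.de</reference>
    </domain>
  </references>
</webstructure>"""


def link_counts(xml_text, direction):
    """Map each referenced host to its count for direction 'in' or 'out'."""
    root = ET.fromstring(xml_text)
    counts = {}
    for refs in root.findall("references"):
        if refs.get("direction") != direction:
            continue
        for ref in refs.iter("reference"):
            # Sum in case the same host appears under several domains.
            counts[ref.text] = counts.get(ref.text, 0) + int(ref.get("count"))
    return counts


# List the hosts that link to yacy.net, strongest first.
incoming = link_counts(SAMPLE, "in")
for host, count in sorted(incoming.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{host}: {count}")
```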

Integration

MetaGer

This German meta-search engine runs some YaCy peers to enrich the meta-search results with self-indexed web pages - metager.de

Client Libraries

Client libraries for administering YaCy peers through the YaCy API

ismael.pm

Perl module to control YaCy peers - http://ismael.audioattack.de/

PHP Interface

A PHP client interface is described in the YaCy Wiki

OAI Access

Open Archive (OAI-PMH) servers can be accessed with YaCy and YaCyOAI PHP scripts

YaCymin

An administration library for YaCy with PHP scripts is available: YaCymin library

Greasemonkey Script

YaCyIndexerGreasemonkey is a script to index visited websites with YaCy
