The YaCy API: external access to the YaCy search server

YaCy can be embedded into any web page using the search API presented here. The API also makes it easy to integrate YaCy into other programming environments such as PHP or Perl.

Data export with XML: integration of search results into other applications

Search results can be retrieved in JSON and RSS formats, as specified by opensearch.org, so they can be consumed by any RSS client library. For example:

> curl "http://localhost:8090/yacysearch.rss?query=foaf&maximumRecords=10"

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='/yacysearch.xsl' version='1.0'?>
<rss version="2.0"
  xmlns:yacy="http://www.yacy.net/"
  xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
>
<!-- heavily abridged example! -->
  <item>
    <title>Friend of a Friend (FOAF) project</title>
    <link>http://www.foaf-project.org/</link>
    <pubDate>Fri, 23 May 2008 02:00:00 +0200</pubDate>
  </item>
  <item>
    <title>FOAF - Wikipedia</title>
    <link>http://de.wikipedia.org/wiki/FOAF</link>
    <pubDate>Tue, 08 Jan 2008 01:00:00 +0100</pubDate>
  </item>
  <item>
    <link>http://microformats.org/wiki/xfn-to-foaf</link>
    <pubDate>Fri, 09 May 2008 02:00:00 +0200</pubDate>
  </item>
</rss>
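
The same request can be issued from a script. The following is a minimal Python sketch, assuming a YaCy peer is reachable at http://localhost:8090 as in the curl call above; the function names build_search_url, parse_items and search are hypothetical helpers chosen for illustration and are not part of YaCy.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_search_url(query, maximum_records=10, base="http://localhost:8090"):
    """Build the yacysearch.rss URL with URL-encoded parameters."""
    params = urllib.parse.urlencode(
        {"query": query, "maximumRecords": maximum_records}
    )
    return f"{base}/yacysearch.rss?{params}"


def parse_items(rss_text):
    """Return one dict per <item>; child elements that are absent become None."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "pubDate": item.findtext("pubDate"),
        })
    return results


def search(query, maximum_records=10):
    """Fetch and parse results (assumes a running peer at the default base URL)."""
    url = build_search_url(query, maximum_records)
    with urllib.request.urlopen(url) as response:
        return parse_items(response.read())
```

Calling search("foaf") against a running peer would return a list of dictionaries with the keys title, link and pubDate. Items without a title, like the third one in the abridged output above, yield None for that key.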


Data import with XML: 'Surrogates'

External data sources can easily be loaded into YaCy: the data must be formatted as XML using Dublin Core metadata elements. The XML files are imported and indexed automatically when they are placed in the hand-over directory (DATA/SURROGATES/in). Here is an example of a 'surrogate' file:

<?xml version="1.0" encoding="utf-8"?>
<!-- YaCy surrogate using dublin core notion -->
<surrogates xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title><![CDATA[Alan Smithee]]></dc:title>
    <dc:identifier>http://de.wikipedia.org/wiki/Alan_Smithee</dc:identifier>
    <dc:description><![CDATA[Der als Filmregisseur oft genannte '''Alan Smithee''' ist ein Anagramm von „The Alias Men“.]]></dc:description>
    <dc:language>de</dc:language>
    <dc:date>2009-04-14T00:00:00Z</dc:date> <!-- date is in ISO 8601 -->
  </record>
</surrogates>
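
Surrogate files like the one above can also be generated by a script. The following is a minimal Python sketch using only the standard library; build_surrogate is a hypothetical helper, and the set of Dublin Core fields it writes is limited to those that appear in the example. Text is XML-escaped rather than wrapped in CDATA sections, which is equivalent for an XML parser.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Serialize Dublin Core elements with the "dc" prefix used in the example.
ET.register_namespace("dc", DC_NS)

# Only the fields shown in the example file; an assumption, not a full list.
FIELDS = ("title", "identifier", "description", "language", "date")


def build_surrogate(records):
    """Build a <surrogates> document from dicts keyed by Dublin Core field name."""
    root = ET.Element("surrogates")
    for record in records:
        rec = ET.SubElement(root, "record")
        for field in FIELDS:
            if field in record:
                ET.SubElement(rec, f"{{{DC_NS}}}{field}").text = record[field]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_surrogate([{
        "title": "Alan Smithee",
        "identifier": "http://de.wikipedia.org/wiki/Alan_Smithee",
        "language": "de",
        "date": "2009-04-14T00:00:00Z",  # ISO 8601, as in the example
    }]))
```

The returned string can then be saved as a UTF-8 encoded .xml file in DATA/SURROGATES/in.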


Integration into wikis, forums and blogs using the search widget

YaCy provides ready-to-use code snippets that can be integrated into the HTML code of any web page. To load content from content management software such as blogs, wikis and forums, YaCy has specialized harvesters. The YaCy search then works as a meta-search across your data sources and can provide a faceted view that distinguishes them in the search results.

Retrieval of the web page link structure

The link structure of web domains can be visualized and also exported as XML data, which may be of interest to web page designers. For example:

> curl "http://localhost:8090/api/webstructure.xml?about=yacy.net"


<webstructure maxhosts="20000">
  <references direction="out" count="1" maxref="300">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="VRAHIA" count="5">suma-ev.de</reference>
      <reference id="EMaLDQ" count="3">www.kit.edu</reference>
      <reference id="sX4ozA" count="15">liebel.fzk.de</reference>
    </domain>
  </references>
  <references direction="in" count="1">
    <domain host="yacy.net" id="Fh1hyQ" date="20090618">
      <reference id="a_bYbR" count="32">de.wikipedia.org</reference>
      <reference id="DWDqhA" count="1">hwiki.fzk.de</reference>
      <reference id="4JR9RA" count="1">wiki.yacy.de</reference>
      <reference id="wqcWfA" count="1">www.itgrl.de</reference>
      <reference id="P290EA" count="128">www.heise.de</reference>
      <reference id="z4bRCA" count="1">blog.suma-ev.de</reference>
      <reference id="sX4ozA" count="5">liebel.fzk.de</reference>
      <reference id="FXg39Q" count="3">www.yacy.net</reference>
    </domain>
  </references>
</webstructure>
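
To work with this output programmatically, the XML can be turned into a nested dictionary. The following is a minimal Python sketch based only on the element and attribute names visible in the example above; parse_webstructure is a hypothetical helper, not part of YaCy.

```python
import xml.etree.ElementTree as ET


def parse_webstructure(xml_text):
    """Map each direction ("in"/"out") to {host: {referenced_host: count}}."""
    root = ET.fromstring(xml_text)
    links = {}
    for references in root.iter("references"):
        hosts = links.setdefault(references.get("direction"), {})
        for domain in references.iter("domain"):
            counts = hosts.setdefault(domain.get("host"), {})
            for reference in domain.iter("reference"):
                # The element text is the linked host; "count" is the link count.
                counts[reference.text] = int(reference.get("count"))
    return links
```

For the response shown above, the result contains the keys "out" and "in", each holding the per-host reference counts for yacy.net.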