{"id":1245,"date":"2009-09-07T10:23:21","date_gmt":"2009-09-07T08:23:21","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=1245"},"modified":"2009-09-07T13:58:08","modified_gmt":"2009-09-07T11:58:08","slug":"sphinxse-0-9-9-rc2-bug-workaround","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/sphinxse-0-9-9-rc2-bug-workaround","title":{"rendered":"SphinxSE 0.9.9-RC2 bug workaround"},"content":{"rendered":"<p>There is a serious bug with the sphinx storage engine, introduced in 0.9.9-RC2 (and which has not been fixed in latest revisions, as yet &#8211; last checked with rev 2006).<\/p>\n<p>I would usually just revert to an older version (0.9.9-RC1 does not contain this bug), but for the reason that RC2 introduces an important feature: the <strong>sphinx_snippets()<\/strong> function, which allows for creation of snippets from within MySQL, and which makes the sphinx integration with MySQL complete, as far as the application is concerned.<\/p>\n<h4>The bug<\/h4>\n<p>The bug is described <a href=\"http:\/\/sphinxsearch.com\/forum\/view.html?id=3589\">here<\/a> and <a href=\"http:\/\/sphinxsearch.com\/forum\/view.html?id=4081\">here<\/a> (and see further discussions). Though it&#8217;s claimed to have been fixed, it&#8217;s been re-reported, and I&#8217;ve tried quite a few revisions and verified it has not been fixed (tested on Debian\/Ubuntu x64).  <span>Essentially, the bug does not allow you to set filters on a query issued from within the SphinxSE. For example, the following queries fail:<\/span><\/p>\n<blockquote>\n<pre>SELECT ... FROM ... WHERE query='python;mode=any;sort=relevance;limit=200;range=myUnixTimestamp,1249506000,1252184400;'\r\nSELECT ... FROM ... WHERE query='python;mode=any;sort=relevance;limit=200;filter=my_field,1;'<\/pre>\n<\/blockquote>\n<p>While the following query succeeds:<\/p>\n<blockquote>\n<pre>SELECT ... FROM ... WHERE query='python;mode=any;sort=relevance;limit=200;'<\/pre>\n<\/blockquote>\n<p>The error message is this:<\/p>\n<blockquote>\n<pre><span>ERROR 1430 (HY000): There was a problem processing the query on the foreign data source. Data source error: searchd error: invalid or truncated request<\/span><\/pre>\n<\/blockquote>\n<p><span>I see this as a serious bug in the SphinxSE: it renders it useless; searching without the ability to filter is not something I can live with.<!--more--><\/span><\/p>\n<h4><span>The motivation<\/span><\/h4>\n<p><span>Sphinx does not store the actual text content. To get search results with snippets, you need to:<\/span><\/p>\n<ul>\n<li><span>Ask sphinx for the documents ids<\/span><\/li>\n<li><span>Get the content for those documents<\/span><\/li>\n<li><span>Ask sphinx for snippets based on the provided content and search phrase.<\/span><\/li>\n<\/ul>\n<p>With the introduction of the <strong>sphinx_snippets()<\/strong> function, this can all be done with a single query, like this:<\/p>\n<blockquote>\n<pre>SELECT my_docs.my_docs_id,\u00a0 my_docs.publish_time,\u00a0 CONVERT(sphinx_snippets(my_docs.id.content, 'my_docs_index', 'python') USING utf8) AS snippet\u00a0 FROM tets.my_docs INNER JOIN test.my_docs_sphinx USING(my_docs_id) WHERE query='python;mode=any;sort=relevance;limit=200;range=publish_time_unix,1249506000,1252184400;';<\/pre>\n<\/blockquote>\n<p><span>This is really a life saver; without this function, you need to get the resutls back to your application, then send the data again to MySQL, in which case you might altogether discard the SphinxSE and talk to sphinx directly. But with a single query you get to ask the results just as if you were asking for any result set from your database (with extra syntax).<\/span><\/p>\n<h4><span>The workaround<\/span><\/h4>\n<p><span>My setup is Percona&#8217;s <\/span><strong>mysql-5.1.34-xtradb5<\/strong> source, on Ubuntu server <strong>8.04 amd64<\/strong>. The trick is to first compile MySQL with sphinx <strong>0.9.9-RC2<\/strong>, in order to produce the <strong>sphinx.so<\/strong> file (where the <strong>sphinx_snippets()<\/strong> function is found), backup the <strong>sphinx.so<\/strong> file, then recompile everything with <strong>sphinx 0.9.9-RC1<\/strong>. The steps being:<\/p>\n<p>Compile MySQL with sphinx <strong>0.9.9-Rc2<\/strong> (I choose to install MySQL on <strong>\/usr\/local\/mysql51<\/strong>):<\/p>\n<blockquote>\n<pre>tar xzfv mysql-5.1.34-xtradb5.tar.gz\r\ncd mysql-5.1.34-xtradb5\r\ncp -R \/tmp\/resources\/sphinx-0.9.9-rc2\/mysqlse storage\/sphinx\r\nsh BUILD\/autorun.sh\r\n.\/configure --with-plugins=innobase,sphinx --prefix=\/usr\/local\/mysql51\r\nmake<\/pre>\n<\/blockquote>\n<p>This produces the <strong>sphinx.so<\/strong>, <strong>sphinx.so.0<\/strong>, <strong>sphinx.so.0.0.0<\/strong> files. Back them up!<\/p>\n<p>Next, recompile with sphinx <strong>0.9.9-RC1<\/strong>. I&#8217;ve found that simple copying and recompiling doesn&#8217;t work well. So just cleanup everything and start afresh:<\/p>\n<blockquote>\n<pre>cd ..\r\nrm -rf mysql-5.1.34-xtradb5\r\ntar xzfv mysql-5.1.34-xtradb5.tar.gz\r\ncd mysql-5.1.34-xtradb5\r\ncp -R \/tmp\/resources\/sphinx-0.9.9-rc1\/mysqlse storage\/sphinx\r\nsh BUILD\/autorun.sh\r\n.\/configure --with-plugins=innobase,sphinx --prefix=\/usr\/local\/mysql51\r\nmake\r\nsudo make install<\/pre>\n<\/blockquote>\n<p>Copy the <strong>sphinx.so<\/strong> files into the MySQL plugin directory (<strong>\/usr\/local\/mysql51\/lib\/mysql\/plugin<\/strong> in our case).<\/p>\n<p>Then build sphinx (you must have MySQL includes for sphinx to compile, so this must be the second step):<\/p>\n<blockquote>\n<pre>cd \/tmp\/resources\/sphinx-0.9.9-rc1\/\r\n.\/configure --prefix=\/usr\/local\/sphinx --with-mysql=\/usr\/local\/mysql51\r\nmake\r\nsudo make install<\/pre>\n<\/blockquote>\n<p>Essentially, we&#8217;re working now with <strong>0.9.9-RC1<\/strong>, but the <strong>sphinx_snippets()<\/strong> function is from the <strong>0.9.9-RC2<\/strong> version, and happily no one bothers about this mix.<\/p>\n<p>I hope this helps.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There is a serious bug with the sphinx storage engine, introduced in 0.9.9-RC2 (and which has not been fixed in latest revisions, as yet &#8211; last checked with rev 2006). I would usually just revert to an older version (0.9.9-RC1 does not contain this bug), but for the reason that RC2 introduces an important feature: [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[10,44],"class_list":["post-1245","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-installation","tag-sphinx"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-k5","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1245","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=1245"}],"version-history":[{"count":19,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1245\/revisions"}],"predecessor-version":[{"id":1261,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/1245\/revisions\/1261"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=1245"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=1245"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=1245"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}