Comments on: SQL: finding a user’s country/region based on IP https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip Blog by Shlomi Noach Wed, 16 Dec 2009 07:11:00 +0000 hourly 1 https://wordpress.org/?v=5.3.3 By: Joshua K Roberson https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2948 Sun, 02 Aug 2009 04:32:26 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2948 Here are different ways to store an IP or an IP range as well as query for the IPs.
http://strictcoder.blogspot.com/2009/08/different-ways-to-query-for-ip-in-your.html

]]>
By: shlomi https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2113 Thu, 28 May 2009 03:13:33 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2113 @Michael,

Great! Thanks for the benchmarks!

]]>
By: Michael https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2110 Wed, 27 May 2009 21:02:58 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2110 @shlomi,

D’oh! Yes, you’re completely right. I wasn’t reading sql statement correctly, and there was a typo in it (using >= instead of <=) in my test cases. (Note to self, do not comment until after coffee has been had.)

I did do some benchmarking on some stuff here.

The winner is definitely the “SELECT * (SELECT * .. WHERE my_ip <= ip_begin..) AS T …” at least on my box configuration. – approx 5K queries/sec (single client)

The worse is of course, the “ip BETWEEN start and end” at a whole whopping 3 queries/sec, and that’s with the optimizer claiming to use start ip index.

My version using classb was able to crank out 2K queries/sec. Not bad, but only half as good as the SELECT(SELECT *) AS T.. one.

]]>
By: shlomi https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2109 Wed, 27 May 2009 19:29:45 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2109 @Michael,

I disagree. The ‘229512’ rows reported in first version’s subquery does not really hold. That’s what’s peculiar about EXPLAIN: it does not (maybe can not?) take the LIMIT into consideration. I assure you (by testing it and seeing the performance, that is) that the number reported is not actually the number of rows to be scanned in this plan. The real number is 1.

Please notice that this is regardless of the subquery: see the EXPLAIN plan discussion within my post.

The second step involved is really nothing, since it only needs to scan a table of a single row. So no real impact on memory or CPU here.

Whereas in the second version you’ve provided, and as I explained in my post, 229512 may actually be a reasonable estimation for number of rows to scan.

Regards

]]>
By: Michael https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2108 Wed, 27 May 2009 17:15:12 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2108 @jason

Unfortunately, that expression is worse. The sub-select will grab ALL rows above the starting IP, create a derived table, and then scan through that derived table which *does not* have an index. The lower the IP number is, the more rows that will have been dumped into the derived table.

Under MySQL, derived tables do not inherit any indexes from the original table, or tables, it was created from. It would be awesome if it did, but it doesn’t.

The MySQL optimizer ain’t too happy about it:

explain SELECT * FROM (select * from geo_ip_blocks WHERE 412306898 >= ip_begin ORDER BY ip_begin limit 1) AS T where 412306898 <= ip_end\G

*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Impossible WHERE noticed after reading const tables
*************************** 2. row ***************************
id: 2
select_type: DERIVED
table: geo_ip_blocks
type: range
possible_keys: ip_begin,index_on_ip_begin,classb_begin
key: ip_begin
key_len: 4
ref: NULL
rows: 229512
Extra: Using where

Now look at the following SQL which is what we all naturally want to try first when attacking this problem:

explain SELECT * FROM geo_ip_blocks WHERE 412306898 BETWEEN ip_begin AND ip_end\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: geo_ip_blocks
type: range
possible_keys: ip_begin,ip_end,index_on_ip_begin,classb_begin
key: ip_begin
key_len: 4
ref: NULL
rows: 229512
Extra: Using where

In both cases the estimated number of rows to scan is the same: 229512. The first SQL statement is going to be worse since there’s a second step involved with having to create a derived table.

The second SQL only has one action involved, which is exactly the same as the one in the first SQL.

]]>
By: shlomi https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2101 Wed, 27 May 2009 08:40:00 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2101 @Jason,

This seems right to me, and was what I meant… 🙂

]]>
By: Jason Stubbs https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2100 Wed, 27 May 2009 06:52:10 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2100 I didn’t consider the case of a no-match when I attacked this problem myself way back when. Although the SQL starts to get a little ugly, using a sub-query should fix the performance hit there too.

SELECT * FROM (
SELECT * FROM regions_ip_range
WHERE my_ip >= start_ip
ORDER BY start_ip LIMIT 1
) AS t
WHERE my_ip <= end_ip;

I haven’t tested this performance-wise but it seems right theoretically… Either way, your spot on both of your “wrong ways”. 🙂

]]>
By: shlomi https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2099 Wed, 27 May 2009 06:23:08 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2099 @Michael,
very nice!
Out of curiosity: is there no way a range consists of more than one classB?

]]>
By: shlomi https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2098 Wed, 27 May 2009 06:20:39 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2098 @PB – #9
suspiciously identical comment to Jason #3??
Anyway, see my comment, #4

]]>
By: Michael https://shlomi-noach.github.io/blog/mysql/sql-finding-a-users-countryregion-based-on-ip/comment-page-1#comment-2097 Wed, 27 May 2009 06:14:44 +0000 https://shlomi-noach.github.io/blog/?p=705#comment-2097 FYI – the geo_ip_blocks table has 4M rows.

]]>