Percona Live 2013 keynotes: followup questions and discussion

April 28, 2013

Here are a few questions that remained open for me from Percona Live 2013, about things that were said during the keynotes; I would appreciate a discussion in the comments. Here goes:

Question #1

Brian Aker (HP) asks Simone Brunozzi (Amazon) what the underlying technology for DynamoDB is. Simone says he can't disclose. Brian says: "it's MySQL!!". Simone says: "can't disclose". Brian insists: "it's MySQL!!"

Seriously? I would be very much surprised to learn that DynamoDB uses MySQL; it doesn't make sense to me. Why would Brian Aker say that, though? Did he just mean to tease Simone, or is there something I just don't get?

(Yes, Brian?)

Question #2

Matt Aslett speaks about adoption of MySQL & variants, and expected adoption in the next years. He mentions MariaDB, Percona Server, SkySQL. He keeps saying how the SkySQL server gets more traction.

What does he mean? There's no SkySQL fork; does he mean the SkySQL cloud offering? Or just SkySQL support services, typically for the MariaDB variant? But in that case, SkySQL is out of context. What's going on?

Question #3

Matt Aslett presents a quite pessimistic prediction for MySQL: reduced popularity in the next years. Relatively good news for MontyProgram/MariaDB; otherwise a lot of switching to PostgreSQL, poor adoption for Continuent, low ratios of "evaluation" to "adoption" of technologies; really quite depressing. Later he mentions that this is based on about 200+ questionnaires.

I don't have a special interest here as I don't work for any mentioned company; other than my general desire to see the ecosystem flourishing.

Are 200+ people enough to give a faithful picture of current MySQL usage and adoption? Are they enough for a prediction 1 year into the future? 4 years into the future?

In Israel, with a population of less than 8M, election surveys usually look at 500+ people. To me it doesn't sound like a lot, but statistics is not my strong suit. However, those picked for the survey have to be a diverse population, distributed in similar ratios to the overall population (so Jews, Muslims, Christians, Orthodox, income level, geographic location, etc.).
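To put rough numbers on "200+" versus "500+", here is a back-of-the-envelope sketch of the textbook margin of error for a surveyed proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case proportion of 0.5; these are illustrative assumptions, not figures from the keynote or from any actual election poll, and the function name is made up for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from a simple random sample of size n.
    z=1.96 corresponds to a 95% confidence level (an assumption here)."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes mentioned in the post, plus 1000 for comparison.
for n in (200, 500, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
# n=200: +/- 6.9 percentage points
# n=500: +/- 4.4 percentage points
# n=1000: +/- 3.1 percentage points
```

Note that this formula only applies when respondents are picked at random; it says nothing about how representative the respondents are.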

Does the same happen with those 200+ questionnaires in the 451 research? The reason I ask is this: at the end of the keynote Matt says: "if you want your voice to be heard, if you think differently, contact me, and I'll add you to our survey". Does this mean anyone stepping up is included? Great, so a hypothetical company called GalleriaDB would encourage its 50 employees to enlist, thereby completely shifting the balance.
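The arithmetic behind the GalleriaDB scenario is simple enough to sketch. The starting figures below (200 respondents, 10 of whom favor the hypothetical GalleriaDB) are invented purely for illustration; only the "200+" and "50 employees" orders of magnitude come from the text above.

```python
def share_after_stuffing(respondents, supporters, new_supporters):
    """Share of respondents favoring a product after `new_supporters`
    additional people, all favoring it, opt in to the survey."""
    return (supporters + new_supporters) / (respondents + new_supporters)

# Hypothetical starting point: 10 of 200 respondents favor GalleriaDB.
before = share_after_stuffing(200, 10, 0)
# The company's 50 employees all enlist and all favor their own product.
after = share_after_stuffing(200, 10, 50)

print(f"before: {before:.0%}, after: {after:.0%}")
# before: 5%, after: 24%
```

Under these made-up numbers, 50 self-selected respondents move the product from 5% to 24% of the sample.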

Who are those 200+ people in the survey? Worldwide-known experts? Your regular DBA? Your remote DBA consultant? Your web developer? Do they represent the overall MySQL ecosystem population? Do their insights into the future coincide with those of everyone else?

Please discuss below

  • Great questions.

    A few prominent NoSQL systems (Voldemort, Sherpa/PNUTS) use MySQL for single-node storage. I think the big reason to do that is to use InnoDB. So I wouldn't be surprised if DynamoDB were to do the same. But this doesn't mean much to me. As soon as something better comes along MySQL/InnoDB is likely to get swapped out.

    Are the big deployments at FB, Google, Amazon 1 user or many users in such a survey?

  • Hi Shlomi,

    You have great questions.

    I also wonder who the questionnaire respondents are/who was polled. I know I wasn't. I'm guessing they are research clients (i.e. people who can afford to pay thousands for a subscription to the reports!).

    But I also noted regular mention of the word "Drizzle". Which implies that the research suggests that Drizzle is still something people consider deploying.

    I did fill out the SurveyMonkey link at the end though. I do hope to participate in future surveys.

  • Shlomi, I started my career looong ago, writing procedures (on mainframe) for a company doing statistical analysis. I worked in that field for years, writing models to read and interpret the data.

    As a matter of fact, statistics need to be taken on a well-selected set of representative identities.

    When you consider a scenario like MySQL, which is widely used but not universally used (as, say, water consumption is) and covers several segments (from enterprise to single developer), the set cannot be less than 10 - 15% of the total number per segment.

    Questionnaires and reported statistics must also be differentiated by segment, given that each one has a different trend, even if they are correlated.
    Some questions can be generic, but then you also need questions per segment to contextualize the analysis.

    Finally, the set used must be described at the beginning (not at the end), to provide the information needed to interpret the data correctly.

    In short, what was presented was... nothing.

    No meaning; numbers have no sense this way, and drawing conclusions based on those numbers is simply nonsense.

    What you report, that if you provide your feedback you will be included, shows that he has no idea what he is talking about, given that another characteristic of a study is TIME. You need to collect the information in a specific time frame, and then close.
    You cannot add later... that is (again) nonsense.

    I am not used to being so drastic, but this is the kind of talk that should be reviewed beforehand, to avoid embarrassing situations like this, where numbers coming from nowhere and with no meaning are reported as real.

    all the best.

  • Justin Swanhart

    There were a number of posts on Planet inviting people to take the survey. It wasn't a random sampling; it was an opt-in survey, which means it has no scientific value. Aslett can correct me if I'm wrong, but the advertised survey said it was going to be used in a Percona Live keynote, so I avoided taking it, as I'm a Percona employee and that would create bias.
