Radio Galaxy Zoo Talk

RGZ was down for many hours today (1 Jan, 2016); how to avoid this in future?

  • JeanTate by JeanTate

    Apparently there was a problem earlier today (1 Jan, 2016) that affected many active Zooniverse projects, leaving Talks inaccessible, classification sites dark, and more.

    Of course, many ordinary zooites noticed this immediately, and some wrote about it in Zooniverse Talk.

    After ~12+ hours, a member of one Science Team (not RGZ) contacted someone on the Z dev team, and the problem was sorted out quickly.

    As far as I know, no ordinary zooite, nor even any moderator, has the contact details of someone on the dev team who can sort things like this out, so there is no way for any of us to act on such outages during a holiday (or a Sunday?).

    Of course, if the problem were major, requiring lots of resources to fix, that'd be very serious for the Zooniverse; however, in this case it seems it took little time and effort to fix.

    How can such things be avoided in future, do you think?

  • csunjoto by csunjoto

    Take a break from classifying every 1st January?
    Just kidding 😃 ... Happy new year Jean & everyone

  • ivywong by ivywong scientist, admin

    @JeanTate & @csunjoto.

    Apologies about the NY inconvenience. It is difficult for us to access technical help during the holiday period because the holiday period is effectively 2 days rather than 1 due to the many timezones in which we are distributed. I am relieved that things have been sorted after 12 hours. IMHO, this is actually a very impressive timescale on a major worldwide holiday. I do see your point about the problem and I am sure that the developers will do their best to minimise such outages in the future.

    Thanks again for the heads-up & happy new year to all!

  • JeanTate by JeanTate in response to ivywong's comment.

    Thanks Ivy.

    I should clarify one thing: Meg (a key Science Team member/PI of P4, PH, and now CH) contacted the dev team, but I can't say when; also, it's not clear that her contacting them had a substantial impact on when the fix was done.

    In any case, there seems to be a real opportunity to shorten the time to resolve at least these kinds of outages; namely, faster communication to whoever in the dev team is 'on duty' or 'on call' during holidays.

  • sisifolibre by sisifolibre

    The Zooniverse offline for 12 hours???? The staff must have had a big party!!! 😛

    Happy new year, and be careful with the champagne next time! 😉

  • csunjoto by csunjoto in response to ivywong's comment.

    Thanks Ivy. Personally, I don't see this inconvenience as a big thing; it's a holiday everywhere. A bit of a delay in fixing this site on 1st January is understandable to me 😃

    My thanks to anyone on the dev team who gave up his/her holiday time to repair the entire Zoo Universe. Every god in every universe needs to rest at the end of the week/year.

  • ivywong by ivywong scientist, admin in response to csunjoto's comment.

    Thank you very much @csunjoto for your understanding. 😃

  • JeanTate by JeanTate

    Guys, this isn't just a minor inconvenience to be put up with so that key people can have a break.

    The Zooniverse has over a million registered members, so that's 1+ million passwords, email addresses, and user names. A very tempting target for a hacker. And the database of names etc is constantly in use (that's how we log/sign in). And one sign that a system has been hacked is that it goes down (of course, not all hacks result in an offline system). And when better to mount a hack than when no one is 'watching the store'?

    Of course, the Zooniverse does take protecting our personal data seriously. They have to, given the various laws on such things in the UK (for example); of course, we do not know how they do this (and we shouldn't, except perhaps at a very high level). But a corollary is that we need to be reassured that our personal data is safe, when the system goes down for ~12 hours ...

    Just my $0.02's worth.

  • ivywong by ivywong scientist, admin

    Thanks @JeanTate. We do take your security seriously.

  • JeanTate by JeanTate in response to ivywong's comment.

    Thanks Ivy.

    A suggestion: when there's an apparent major event like the one on NYD, how about an announcement from Z HQ, firstly acknowledging the event (or explaining why it didn't really happen, despite appearances); secondly briefly describing what happened (or didn't) and pointing out that it had no security implications (or?); thirdly saying something about what measures have been put in place (or will be) to lessen the chances of such a thing happening in future.

    As I understand it, this is kinda 'best practice' for entities that have as large an internet presence as the Z. And keeping mum is classic 'bad practice'.
