User talk:Erik Zachte/Statistics - Wikipedia, the free encyclopedia

User talk:Erik Zachte/Statistics

From Wikipedia, the free encyclopedia

[edit] Stats

Erik, a first french version of WikiReports is available on fr: at WikiReports_FR.pl. -- Looxix 23:57, 14 Aug 2003 (UTC)

Thanks for the stats. The translation of the languages part is done. -- Looxix 21:38, 15 Aug 2003 (UTC)

Heya,

I just wanted to say thanks again for your excellent statistics pages. They're a real pleasure to look at, and tremendously useful both for analysis and marketing. Great work! —Eloquence 01:29, Dec 9, 2003 (UTC)

Hi Erik! Thanks on the info about the out-of-date stats due to the server switch - I thought my memory was betraying me, because I could swear I have seen more up-to-date stats than I can see now. Now everything is explained. Thank you! --denny vrandečić 15:08, Feb 5, 2004 (UTC)


Greetings. I love the Wikistats collection... A question about updating the language list: Could the language selection be changed? Scottish and Manx don't seem to be active. Swahili and Urdu are more active, and, as major world languages, merit a place on the list (if only to encourage awareness/contributors). +sj+ 10:36, 2004 Mar 15 (UTC)


Hi Erik. Thank you for Wikistats! I wanted to use your scripts on my own MediaWiki site, but I didn't see any instructions, so I played around with it a bit and found that it wasn't too hard. I've updated the meta:Wikistats article with some instructions that will hopefully be useful to others. Cheers! --Fjarlq 15:35, 2 March 2006 (UTC)

Great, I'm glad you did this. Erik Zachte 21:35, 2 March 2006 (UTC)

[edit] Namespace User: in portuguese

Hi Erik,

Just to let you know that the namespace User: is "Usuário:" in the portuguese wikipedia. I'm telling you this because we've had a few cases of people starting user pages under User: which I think came from your stats (which are great, BTW). Jorge @ pt

[edit] ROA-RUP is Aromanian

Hi Erik! Just to let you know that, for the Wikipedia Stats, the code ROA_RUP (which should actually be roa-rup with a hyphen) represents the Aromanian language. It is the only language code without a name in the stats (that's OK since I know that there are so many new Wikipedias being founded all the time!). Would it also be OK if you could update the Romanian language file for the Stats so that the name of roa-rup is Aromână (the name of the language in Romanian). Aromanian is a language similar to Romanian (a bit like Kashubian to Polish). And, one more question. Are Categories counted as an interwiki link (so, if an article belongs to two categories, and links to them via [[Category:x]] is that link counted as an interwiki?) and also, are the articles in a Category page all counted as interwikis, even though they are machine-generated (cannot be edited). Cheers (and thanks again for the great stats), Ronline from ro.wiki (talk).

[edit] Wikipedia stats

Hi Erik,

First of all, great job on Wikistats... I certainly didn't expect all of it to be the work of a single fellow Wikipedian! :thumbsup: ;)

I've got a question stemming from a dispute with User:Dbachmann... how did you arrive at the word count statistics for Chinese, Japanese, Korean, and Thai? Chinese and Japanese don't use spaces at all to separate their words, so do the counts refer to single characters? Thai uses spaces to separate sentences and clauses instead of words, so do the word counts refer to single syllables instead (or something else)? Korean does use spaces similarly to European languages, so does the word count also refer to "words" as strings of characters separated by spaces?

Thanks in advance! -- ran (talk) 20:53, Jan 2, 2005 (UTC)

Hi Ran, thanks for the compliments. I had some discussion with User:Tomos about word count long ago. I decided to make a rough estimate of the ratio of English words to Japanese characters by comparing the English and Japanese full-sentence texts in the wikistats language files, which are supposed to be very literal translations. Same for Chinese. I ended up with one Japanese character being 0.3 English words and one Chinese character being 0.4 English words (the numbers are fictional, I'm not at home, so can't check the actual ratios right now). So I count Unicode characters in both languages and multiply by a language-specific factor, so far only for Japanese and Chinese. It is very rough, but at least better than the usual word count algorithm that sees spaces, dots etc. as word boundaries. Cheers, Erik Zachte 00:57, 3 Jan 2005 (UTC)

Ah... I see. I'm impressed with the trouble you went to to account for the special orthographies used in those languages. Great job! ;)

Your word counts, btw, are now being used on the Main Page. -- ran (talk) 01:46, Jan 3, 2005 (UTC)
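
The estimate Erik describes in this thread (count characters, then multiply by a language-specific factor) might be sketched roughly as follows. This is an illustration only, not the wikistats code: the function name and the fallback to whitespace splitting are assumptions, skipping whitespace when counting characters is an assumption, and the factors 0.3 and 0.4 are the placeholder values quoted above, which Erik himself calls fictional.

```python
# Placeholder factors taken from the thread; Erik notes they are
# fictional, so do not treat them as the real wikistats ratios.
WORDS_PER_CHAR = {"ja": 0.3, "zh": 0.4}


def estimate_word_count(text: str, lang: str) -> float:
    """Estimate the number of English-equivalent words in `text`.

    Hypothetical helper: for languages with a known factor, count
    non-whitespace Unicode characters and scale them; otherwise fall
    back to an ordinary whitespace-delimited word count.
    """
    factor = WORDS_PER_CHAR.get(lang)
    if factor is None:
        # Languages that separate words with spaces.
        return float(len(text.split()))
    # Assumption: whitespace is not counted as a character.
    chars = sum(1 for ch in text if not ch.isspace())
    return chars * factor


if __name__ == "__main__":
    print(estimate_word_count("日本語のテキスト", "ja"))
    print(estimate_word_count("plain English text", "en"))
```

A lookup table keeps the per-language factors in one place, so adding another language (Thai, say, once a ratio has been measured) would only need a new entry.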

[edit] StatisticsUsers.csv

Hi Erik; you implied on Wikipedia talk:List of Wikipedians by number of edits that you have some responsibility for http://www.wikipedia.org/wikistats/csv/StatisticsUsers.csv. I use it for purposes other than generating the WP:1000 lists, but the csv has not been updated in some time. Is it a manual process to update it, or is there a cron job broken somewhere? Where would be the best place for me to ask that it be run again? —Ben Brockert (42) UE News 02:45, Jan 27, 2005 (UTC)

Brockert, the csv files are generated by my wikistats script. This is normally run after new database dumps have been created, which has not happened for three weeks. The databases are reorganized with a different compression algorithm, heavy job, which has been running for weeks now. After this has finished, I still need to adapt the scripts to the new scheme, which I can only start to work on when a dump is available for testing. Cheers, Erik Zachte 11:07, 27 Jan 2005 (UTC)
Ok, thanks. I shall add to the chorus of requests on Wikitech-l if it doesn't get run in the next week or so. —Ben Brockert (42) UE News 03:39, Jan 28, 2005 (UTC)

[edit] Wikipedia Statistics

These tables and graphs have always been extremely useful, quite invaluable in fact. Congratulations on this work. I notice that it has not been updated since the 30 January 2004, approaching 2 months; is there a particular reason for this? If not, would it be possible to update it? Thanks. --Oldak Quill 00:38, 25 Feb 2005 (UTC)

See post at wikitech for full answer. Erik Zachte 04:01, 25 Feb 2005 (UTC)

I fear this is the last thing you want to hear, but I'd like to suggest a new metric to be included in the (not yet working again, I know, I've read the post) statistics script: number of administrators. Number of articles per admin and number of edits per admin would be nice too, but could be calculated by postprocessing. --Pjacobi 22:20, 2005 Mar 7 (UTC)

I have made a simple script to automatically analyse Wikipedia's growth, and extrapolate using a few models. This can be found on my user page. The current pages on this topic in the wiki itself are out of date, so I was wondering if the scripts could be used to make an automatically updated version of those pages, just like you have done with the growth history. I only had a dump of article creation dates available, so things like article deletions and size criteria aren't included, but this should be mendable by using another database query, I think (I don't have the means to do that - another user did the dump for me). Amaurea 09:20, 2 March 2006 (UTC)

Nicely done. I can't comment or even assist you on automating this. You could ask at wikitech. Best, Erik Zachte 21:41, 2 March 2006 (UTC)

[edit] Wikipedia Statistics - Overview

Erik, It seems there is a problem in the Tamil language column in the above page. There is a mixup with another language. Mayooranathan 17:34, 28 Mar 2005 (UTC)

  • Thanks Erik, I saw your note in Tamil Wikipedia. In fact the statistics belonging to the Tamil language Wikipedia are listed under "zh-min-nan", and maybe this shift continues up to the end. Mayooranathan 17:32, 30 Mar 2005 (UTC)

[edit] Signpost

I hope it will be useful for you occasionally. Certainly your statistics are useful for the rest of us, I think everyone's happy to have them back. Thanks also for straightening out my facts on the article. --Michael Snow 00:59, 30 Mar 2005 (UTC)

[edit] Statistics script

Erik, is this http://members.chello.nl/epzachte/Wikipedia/Statistics/Perl.zip still the old version, working on the uncompressed history? Can you give a pointer to the new version? At de.wikipedia we found someone semi-volunteering to update the de:Wikipedia:Alternative Benutzerstatistik and I suggested, he could learn of the necessary changes, by looking at the changes you made to your script. --10:20, 2005 Apr 7 (UTC)

[edit] Statistics on Wikibooks

Hi Erik, I just saw your compilation of Wikibooks and would like to point you to the similar work I've been doing at the English version:

You extracted the book information automatically - I extracted it manually (I have a clean list e.g. free of foreign books that have been moved already, but as you can imagine it is a lot of work and not well-suited for scaling). I would like to ask for the following: Could you incorporate my SQL scripts into your statistics overview? Then you could also compile a "Top active" list of contributors and edits which is meant to focus future work on wikibooks. This would relieve me from a lot of work that you are doing anyway. My scripts are all available under the links above, and if you like I could also change them so that they work with the book information that you extract automatically. What your automatic extraction misses so far though is:

  • Many foreign books have already been moved to their native wikibook version, and contain a prominent link to the German project. They probably should not be listed anymore.
  • Some books have a different title in the main page than on the subpages (e.g. wikibooks:Learning the vi editor)

--Andreas Ipp 12:29, 10 Apr 2005 (UTC)

[edit] Statistics

I was referred to you to ask about the following: When will the new big Wikipedia statistics come out? May 16th was the date of the last stats and I'm eager to look at the new ones! Thanks in advance --Dungodung 10:02, 16 Jun 2005 (UTC)

Wikistats job uses database dumps. In recent months dumps have been produced only with 4 weeks interval or even longer. As soon as dumps are available I will run the job. Erik Zachte 16:45, 16 Jun 2005 (UTC)

[edit] MediaWiki 1.5

Am I right in fearing, that the 1.5 upgrade once again broke your statistics scripts? --Pjacobi June 28, 2005 12:54 (UTC)

Completely. There will be an XML dump some time in the future. Maybe I could use that. But no idea if and when. Actually the 23 June dumps for Wikipedia (= before 1.5) cannot be processed either, as much of the old table has been moved elsewhere and no dump is available for that (yet). Erik Zachte June 28, 2005 22:04 (UTC)
Sigh! --Pjacobi June 28, 2005 22:36 (UTC)

[edit] Wikipedia statistic

Please, can you update statistic for wikipedia? --Sasa Stefanovic 14:23, 21 July 2005 (UTC)

Wikistats processes the database dumps. There has not been a complete dump for a long time. Since MediaWiki 1.5 the database format has changed completely anyway. An XML export file has been promised by Brion. I'll wait for that. 01:00, 22 July 2005 (UTC)
Okay… that reply was written in July. Is there still no news about when we can expect Wikistats to be updated again?
Please also see my question on your Dutch talk page about changing the localization scripts for Wikistats.
Thank you!
--Verdlanco\talk 11:13, 29 October 2005 (UTC)
Since July no complete dump was made, while admins experimented with new settings. That is until this week, coincidentally. So now I can run new wikistats, hope to do so next weekend. Cheers, Erik

[edit] category tree has a code problem

  • Hello, Erik, I am prattflora from ZH wiki. Those pages about zh category tree that you created has code problems. It could not link to any correct category. Could you please fix 分類: to Category:? It is an important tree for us. I appreciate your great works. = )

--Prattflora 00:33, 12 August 2005 (UTC)

I'll start working on new stats tomorrow and fix it. Erik Zachte 00:37, 12 August 2005 (UTC)

[edit] Stats

Sorry if I'm boring you with this, but I read somewhere that you're busy and that you'll probably have time after Wikimania for adjusting your script for 1.5. Now Wikimania is over and the statistics are 3 months old. I'm sure a lot of people would like them updated and you seem to be the only one who can do it. If you're not able to do it anytime soon, can you please point me to the script, seeing as I couldn't find it online (I guess it's open source). Thanks in advance. :) --Dungo (talk) 13:14, 15 August 2005 (UTC)

I'm working on it, adding support for xml dumps, which does not replace the more complicated sql dump parsing code, but rather is integrated in such a way that the stats scripts stay backwards compatible for other MediaWiki installations that use them. It'll take some time. Hopefully the new xml format ensures that the stats scripts no longer need to be updated two or three times a year. Erik Zachte 16:43, 15 August 2005 (UTC)
By the way, Wikimania was just one week ago. I have a day job as well. Erik Zachte 16:45, 15 August 2005 (UTC)
Thanks for replying. I know you're busy and I just wanted to check if you're actually doing something about it. Good work and keep on wikiing. :) --Dungo (talk) 23:52, 15 August 2005 (UTC)

[edit] stats

hi Erik. yes, i said to mav that it may be possible to use the new statistics (logwood) to produce these statistics.. however at the moment there is no "per month" data, i need to do some work on it first (i'm sure you're familiar with this problem ;). eventually, though, the data will be (and is) available in a MySQL database, which should be quite easy to query from the statistics. —kate

[edit] Referer spam

There is a lot of Referer spam in the stats. The accesses associated with the referer spam are probably imposing a detrimental load on the servers. I think the stats should stop showing referers in order to discourage the referer spam. Many thanks, Main Page 16:29, 1 September 2005 (UTC)

The visitor stats are not mine, please ask Kate. Erik Zachte 19:16, 1 September 2005 (UTC)

[edit] Wikipedia:Orphaned articles

The discussion there shows that you know about the perl script which produces this list. Those lists date from June 2004, and are no longer helpful given how much time has passed. Can you do something? Or would it be wise to propose it at Wikipedia:Articles_for_deletion, to attract the attention that is needed? --KaPe 16:59, 18 January 2006 (UTC)

I published such a list long ago, but the current list has been produced with a different script, see [3]. The history list shows the author is probably user:Docu, who published the script. Also, did you check http://en.wikipedia.org/wiki/Special:Lonelypages which is also mentioned on 'Wikipedia:Orphaned articles'? Erik Zachte 19:21, 18 January 2006 (UTC)

[edit] Charts of language statistics

Hi! When do you plan to update the charts?--Nixer 02:00, 13 February 2006 (UTC)

New xml dumps were only produced recently. First since Dec 13. Either en: dump is incomplete/incorrect or the format changed again. So this needs some extra attention, when I find the time. Erik Zachte 23:40, 14 February 2006 (UTC)

[edit] Stats

Hi Erik, when will you update statistics ? — SasaStefanovic • 03:26 28-02-2006

I will but can't say when. May take some weeks. Sorry. Erik Zachte 18:26, 28 February 2006 (UTC)
Hi. I don't mean to rush, but will that be soon? I'd be keen to analyze them. — fi:Käyttäjä:Hautala 22:57, 1 April 2006 (UTC) P.S. Keep up the good work.

[edit] Page access statistics

At Wikibooks people do not change the text every day. Readers often only access the PDF version of a book. This means that page accesses or hit counts are the best method of estimating use, not edit counts. Is it possible to have a dump of the Apache access logs ( http://httpd.apache.org/docs/1.3/logs.html#accesslog ) or some other page access log? This issue has been discussed several times, most recently at: http://en.wikibooks.org/wiki/Wikibooks:Staff_lounge#Hit_Counting It is very demoralising for someone, or a group, to write a large book and then have no idea whether it has been read! 80.7.36.66 15:32, 18 April 2006 (UTC)

Visitor stats used to be extracted from Apache logs with Webalizer. I then copied those results into Wikistats. However, from October 2004 this process was stopped by the admins, as transferring many GB of Apache logs to one server to merge them before further processing took too many processing cycles and too much LAN bandwidth. This issue has been raised often on wikitech as well. Until someone implements a solution that e.g. only stores 1 in 100 web accesses (good enough for stats) nothing can be done. Erik Zachte 19:43, 18 April 2006 (UTC)
As you know, counters are a standard feature of Wiki software - see Hit counters at bottom of Orthodox wiki . Another approach to sampling might be to turn the counters back on but to only provide code for activating the counters for short periods on one or two pages selected by administrators. This approach would probably answer the needs of Wikibooks. At first inspection this solution would require no more than re-activating counters in the profile and providing a "hit count" template for admins to use on chosen pages. Is this possible? How could we ask for it to be done at Wikibooks? Robinhw 10:44, 20 April 2006 (UTC)
Please discuss this further with Wikipedia admins e.g. at wikitech mailing list. Cheers, Erik Zachte 23:12, 20 April 2006 (UTC)
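
The "store only 1 in 100 web accesses" idea that Erik mentions in this thread could look something like the sketch below. It is a guess at the general technique, not anything Wikimedia actually ran: the function name is hypothetical, the log lines are assumed to follow the Apache Common Log Format (with the request in the first quoted field), and random sampling is only one of several ways to pick the 1 in 100.

```python
import random
from collections import Counter
from typing import Iterable, Optional


def sample_hits(log_lines: Iterable[str], rate: int = 100,
                seed: Optional[int] = None) -> Counter:
    """Estimate per-page hit counts from roughly 1 in `rate` requests.

    Hypothetical helper: each sampled request is weighted by `rate`,
    so the totals approximate the unsampled counts.
    """
    rng = random.Random(seed)
    counts: Counter = Counter()
    for line in log_lines:
        # Keep about one request out of every `rate`.
        if rng.randrange(rate) != 0:
            continue
        # Assumed Common Log Format: ... "GET /wiki/Foo HTTP/1.0" 200 123
        parts = line.split('"')
        if len(parts) < 2:
            continue  # malformed line, skip it
        request = parts[1].split()
        if len(request) >= 2:
            counts[request[1]] += rate
    return counts


if __name__ == "__main__":
    line = '1.2.3.4 - - [18/Apr/2006:15:32:00 +0000] "GET /wiki/Foo HTTP/1.0" 200 123'
    print(sample_hits([line] * 1000, rate=100, seed=1))
```

Because only the sampled lines are parsed and stored, the cost of merging logs drops by about the sampling rate, at the price of noisy counts for rarely visited pages.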

[edit] .csv of block log?

I am interested in obtaining a .csv of the block log for statistical analysis. Can you offer any helpful suggestions on obtaining these data?--HypatiaSD 21:12, 18 April 2006 (UTC)

The block log is not part of the public xml dumps. The Wikimedia download page lists it as 'Data for blocks of IP addresses, ranges, and users. (private)' Please ask about it e.g. at wikitech mailing list. Erik Zachte 22:03, 18 April 2006 (UTC)

[edit] Stats

I was wondering when the stats pages are likely to be updated on any of the Wikimedia projects. Also the stats for Wikisource have never been compiled for the individual language domains. I am trying to analyse the success of various communities at maintaining a certain activity level. Any advice would be appreciated.--Birgitte§β ʈ Talk 17:13, 19 April 2006 (UTC)

I read your post on the foundation mailing list. For several reasons I have not been able to do major MediaWiki work for some months and will keep a low profile even longer, but I'll try to run stats once the system is back online and a recent dump has been completed successfully (Brion is working on that). Cheers, Erik Zachte 23:33, 19 April 2006 (UTC)
Stats look beautiful. Thanks for all your hard work.--Birgitte§β ʈ Talk 12:57, 23 May 2006 (UTC)

[edit] wikipedia stats 2

hi!

the wikipedia stats for de haven't been updated since a half year!

would you please make a fresh one?

thanks, an german wikipedian

There are several reasons for that. One of them is that dumps for large wikipedias tend to fail more often. Please take a look at http://download.wikimedia.org/ The German dump failed again during the last run, as did the English (the same happened on the previous run one month ago). The German dump is now rerunning, two days to go. Fingers crossed. When these dumps fail the only thing you and I can do is draw the attention of developers to this. Cheers Erik Zachte 21:48, 28 April 2006 (UTC)

Hi there. Well I understand the problem with large Wikipedias, but luckily I'd like to ask you for czech statistics. Czech Wikipedia is growing, but still makes no problem I hope. Could you please make fresh statistics for czech wikipedia? Thanks --Zirland 18:14, 9 May 2006 (UTC)

I'm working on new stats (as always the scripts need an update, the Wikimedia environment is always in a state of flux). I also started a new and demanding daytime job and am in the process of moving to another city. So ... Erik Zachte 21:13, 9 May 2006 (UTC)
When you get settled in your new home and get a less demanding boss, I would like an update for the Hebrew Wikipedia [4] :-) Yonidebest 23:08, 18 May 2006 (UTC)
Fingers crossed all stats will be online this weekend. Or midweek at the latest. Erik Zachte 01:26, 19 May 2006 (UTC)

[edit] About bots

Hello Erik: I recognize that it is a minor question, but I wonder: would it be possible to exclude bot edits from the statistics? It is a little frustrating, for those of us who are not machines, to be left out of the ranking because the ten or eleven top editors are bots ;-)). That is my case in the Spanish Wikipedia. In any case, an affectionate greeting and my recognition for your extraordinary work. PACO 15:34, 25 May 2006 (UTC)

Yes that was on the back of my mind for a long time. One hurdle to take is that that info is not in any public xml dump. Also, I'll have to work on other projects (mainly EasyTimeline) first. Erik Zachte 16:14, 25 May 2006 (UTC)
Many thanks for your response. I only wanted to put on the table the subject. The solution when the work allows it! ;-). Cheers PACO 14:58, 30 May 2006 (UTC)

[edit] Category Overview, bug?

Hello, we tried to check circular references which were found by your script and published at [5]. However, we did not find the circular reference at cs:Kategorie:Šablony souřadnic. It seems the algorithm contains a small bug: it ignores <nowiki> tags.

Thank you very much for your script and care about this. Miraceti 19:55, 4 June 2006 (UTC)

Sorry, I am looking at that category again. The problem was a bit different. The algorithm steps into the <noinclude> tag. It is a question whether this is good behaviour. Probably yes. Sorry once more. Miraceti 20:00, 4 June 2006 (UTC)

[edit] Status of m:General User Survey/Questionnaire?

Hello Erik. I just stumbled upon this, it looks quite good - but why has the work stopped over the past few months? I am studying for PhD in sociology, and I will be conducting my own survey of Wikipedia users soon. I'd be happy to share some of what I've learned in preparing my survey with you guys, I'll leave some comments on survey talk page soon.--Piotr Konieczny aka Prokonsul Piotrus Talk 16:13, 7 July 2006 (UTC)

[edit] question about stats update

Hi Erik, thanks a lot for the update! I have a question though regarding the new stats as there seems to be some drastic relative differences between some wikis and absolute number falls in the numbers especially regarding the de version (new articles/day and new wikipedians for instance are the most obvious) so I was wondering whether this is due to some bias (like measuring de or ja at the beginning of the month and fr at the end) or is it due to something else ? Thx a lot for your infos and keep up the good work! LeYaYa 02:49, 26 July 2006 (UTC)

Hi LeYaYa, yes I noticed as well but have not investigated yet. Will do so after Wikimania. My hunch is that the input dump was incomplete. Dump files for the largest wikipedias are still erratic from time to time, and often fail altogether. I hope the next wikistats run will prove this is the case. Erik Zachte 18:26, 26 July 2006 (UTC)

[edit] Wikistats Forbidden

Hi Erik,

I was just curious why I've been getting a 403 on the Wikistats site. I only learned about it a week or so ago, and judging from the Google cache, it's very good. Still, for as long as I've known about it I've been getting a "Forbidden" error message on all of its pages. I can't seem to find any published reason for this with a quick scan of the related information. Can you enlighten me on this?

Thanks for your time! — supreme_geek_overlord 03:02, 13 October 2006 (UTC)

Update: Oops, misspelled your name. Sorry! Please forgive an ignorant American. ;) — 03:06, 13 October 2006 (UTC)

Problem persists and page seems to be dropped from the Google cache by now. --Pjacobi 09:19, 25 October 2006 (UTC)

[edit] Wikistats work now but...

Hi Erik. Nice to see the stats work again, but I have one small and one great problem with it, especially with http://stats.wikimedia.org/wikisource/CS/TablesWikipediaCS.htm , where I am a bureaucrat (without special knowledge of programming etc.). The small one: the users on the Czech Wikisource are not named Wikipedista (as on the Czech Wikipedia) but Uživatel, so now they cannot be reached by a click.
And the main problem: in the second table (active wikipedians) there are two users, namely Odis and Nopreceja. Besides the fact that I cannot remember anybody like this (subdomain created March 2006; I have been working there and on wikisource.org since August 2004), I can find them neither among the users on the Czech Wikisource nor on Wikisource.org. But when I check the edit counter for them, see http://tools.wikimedia.de/~interiot/cgi-bin/count_edits?user=odis&dbname=cswikisource_p and http://tools.wikimedia.de/~interiot/cgi-bin/count_edits?user=nopreceja&dbname=cswikisource_p, I find them both there, with some edits, the first one on 10 October 2006 (Odis) and 23 October 2006 (Nopreceja) respectively. I checked these days and there is no edit from them in the recent changes as far as I can see.
The problem is that some arbitrated and blocked users, who were active in the Czech Wikipedia, are now active also on other projects, in the last days Meta and en.wiki; some seem to be good at programming etc. So I would like to clear up this strange appearance of the two users to be sure there is nothing that could be dangerous for the cs.source, which, on the other hand, is working quite well. I thank you for your help, and please answer me here. Thanks a lot. -jkb- 14:07, 22 November 2006 (UTC) (Jan, Berlin).

You can see the edits for odis here: [6]
I downloaded the latest dump: pages-meta-history.xml.bz2 and unzipped with bunzip2. The 58 Mb unzipped file contains all revisions on all articles with their authors. I scanned for 'Odis' (scan case sensitive) and found this 3 times:
<contributor>
  <ip>Odis</ip>
</contributor> 

A registered user like you is listed in the xml dump as

<contributor>
  <username>-jkb-</username>
  <id>4</id>
</contributor> 

So it seems the Mediawiki parser treats Odis as an anonymous editor. Erik Zachte 15:34, 23 November 2006 (UTC)
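
The check Erik describes here, telling registered from anonymous contributors by whether the dump uses a username or an ip element, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the element names are the ones shown in the snippets above, and any XML namespace on the dump's tags is simply stripped.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://...}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def count_contributors(source):
    """Return (registered, anonymous) revision counts per contributor.

    `source` is a file name or binary file object holding an
    uncompressed dump. Hypothetical helper, streaming with iterparse
    so the whole file is never held in memory.
    """
    registered: Counter = Counter()
    anonymous: Counter = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "contributor":
            continue
        for child in elem:
            name = local_name(child.tag)
            if name == "username":
                registered[child.text] += 1
            elif name == "ip":
                anonymous[child.text] += 1
        elem.clear()  # free the element once it has been counted
    return registered, anonymous


if __name__ == "__main__":
    sample = (b"<mediawiki><page><revision><contributor><ip>Odis</ip>"
              b"</contributor></revision><revision><contributor>"
              b"<username>-jkb-</username><id>4</id></contributor>"
              b"</revision></page></mediawiki>")
    reg, anon = count_contributors(io.BytesIO(sample))
    print(reg, anon)
```

Run against the unzipped pages-meta-history.xml, a name such as Odis would then show up in the anonymous counter rather than the registered one, which matches what Erik found by scanning the file.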

I thank you very much - for the answer, for your work and therefore for your help. Yes. Now it is clear, it seems to be a bug of Transwiki. Both users edited a page in the Czech Wikipedia two or three times, and that page was later transwikied to the Czech Wikisource. On this occasion, so I conclude, we also got some database entries for it on the Wikisource, and although they never registered and never edited on the Czech Wikisource, they appear in the statistics. This is only another bug of transwiki, as I stated in s:cs:Wikisource diskuse:Transwiki (now in Czech only). So many thanks for your help. How to make a dump - I don't know, I think I can imagine a bit what it could be, but... :-), so long, -jkb- 17:18, 23 November 2006 (UTC)

[edit] more about Wikistats

Hi Erik, ! I have been in contact with Jon Harald Søby, who suggested me to ask you. I am writing a project application where I need updated statistical information about the Lombard Wikipedia. The problem is that the author statistics I found in May gave the choice between 'Wikipedians' and 'Wikipedians with > 10 edits', while now, in December, at this stats page ( http://stats.wikimedia.org/EN/TablesWikipediaLMO.htm ) the choice is between 'Wikipedians' and 'Wikipedians with > 5 edits' -- thus making comparison with the older data impossible. How can I get back the info for 'Wikipedians with > 10 edits'? Could you help me with this issue, or possibly address me to someone who can do sthg. about it? Yours --Jorgengb 00:14, 8 December 2006 (UTC)

Hi again, I was just wondering whether my request might have been overlooked by mistake... :-) -- Yours --Jorgengb 14:09, 18 January 2007 (UTC)

Jorgengb, yes it was overlooked, sorry about that. I'm afraid you are mixing things up. It has always been > 5 and > 100 edits (for this month). The first column uses > 10 edits (for all time). Some people confuse tables and charts, where tables use > 5 per month and charts > 5 per week, as they have different granularity, but that is another matter. Cheers Erik Zachte 22:26, 18 January 2007 (UTC)

[edit] a new translation for user

Hi, me again :-)... For the statistics of the Czech Wikisource i would need a new translation of the word USER (in the second and third table). Now all users are translated with Wikipedista:XY, which is taken from the Czech Wikipedia, but in the Czech Wikisource we use the translation Uživatel:XY - see e.g. s:cs:Uživatel:-jkb-; now it is not possible to follow the link - they are wrong. Could you manage it somehow pls? Thanks, -jkb- 16:04, 17 December 2006 (UTC)

Wow, many thanks, I have just seen the changes. Thx, -jkb- 10:27, 18 December 2006 (UTC)

[edit] Strange thing in stats

Hello Erik. Sorry to bore you with some minor questions, and maybe you have no answer to give me: in the statistics for the French Wikipedia, I see that between August and September 2006 the percentage of articles above 2 KB fell suddenly from 34% to 29%. How is that possible? How could such a great number of articles lose bytes in one month? If there had been a massive bot production at this time, it could be explained, but there wasn't and it was a normal month. It is a mystery, and the October-November figures seem to confirm this strange fall... Anyway, thanks for your great job! See you, Enmerkar 02:52, 1 January 2007 (UTC)

I have maybe understood. Does the figure (29%) represent the percentage for the whole wikipedia, or just for the new articles of the new month? Enmerkar 03:01, 1 January 2007 (UTC)
The percentage is for all namespace 0 articles, regardless of whether they were created or even edited in that month. I have no ready explanation for this either. A bot would be the first thing I would think of as well. Possibly data were moved from body text to template? By the way there was a remarkable increase earlier from Jun to July 2005. It is peculiar indeed. Cheers, Erik Zachte 14:24, 2 January 2007 (UTC)
Thanks a lot for your answer. There was a great increase in June 2006 (is that the one you're talking about?) but it was about 3000 articles (allegedly, some articles that had been missed by the official system count), not so much. Is it possible that an error occurred in the stats at one time, and that this error is repeated again? Or are the stats and these numbers (29%, and now 28%) unquestionably right? Anyway, happy new year! Enmerkar 21:47, 5 January 2007 (UTC)
No software product is ever unquestionable. Having said that, a wikistats bug is not very likely in this case. On every run, data for all months are regenerated, so all figures are built using the same algorithm. Erik Zachte 23:15, 5 January 2007 (UTC)

[edit] Update

Hi, When are we going to see a stats update? I am primarily interested in the Wikinews statistics and the user rankings. Nzgabriel 08:20, 26 January 2007 (UTC)

When new dumps are available. Erik Zachte 13:23, 26 January 2007 (UTC)

[edit] Wikimedia growth table

I really like this table, nice work! However, you're way overdue for the updates you promised — Jack · talk · 07:55, Sunday, 18 March 2007

You're absolutely right, Jack! Right now the wikistats job for the new English Wikipedia xml dump (finally!, first in 2007) is running. In a few days, when it's finished and all stats are complete for 2006, I will update the chart. Erik Zachte 15:33, 18 March 2007 (UTC)

[edit] Statistics for all Wikipedias

Hi! In the Statistics for Wiktionary in the Polish localization there is a small language bug. In Polish, Yiddish is jidysz (not jidisz, which looks like Kashubian). Great --joystick 03:47, 27 May 2007 (UTC)

Thx, I'll update the script. Erik Zachte 14:20, 27 May 2007 (UTC)

Commons

There is something wrong with the number of images: [Total number of images presented here]. Only 18k images and 61k articles... is that OK? But separating people and bots in other tables... :) thx Przykuta 21:42, 5 June 2007 (UTC)

Try including {{NUMBEROFARTICLES}} in any page on Commons and preview it (don't save); right now it shows 61,357 'articles'. Many files on Commons are only linked to an image description page. My count applies to namespace 0 articles, as does the macro that on other wikis is included on the main page.
The images count is actually the number of image links [[Image:....]]. I readily admit that for Commons these stats are somewhat confusing and not so relevant. I may need to update the explanations below the table. Thanks for the feedback. Erik Zachte 00:07, 6 June 2007 (UTC)
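
To illustrate what "number of image links" means in the reply above, here is a minimal Python sketch that counts `[[Image:...]]` links in a piece of wikitext. It is not the wikistats code: the function name `count_image_links` and the sample text are hypothetical, and it assumes only the English `Image:` prefix, ignoring localized namespace names.

```python
import re

# Matches the opening of an image link such as [[Image:Foo.jpg|thumb]].
# Assumption: only the English "Image:" prefix is recognized here.
IMAGE_LINK = re.compile(r"\[\[\s*Image\s*:", re.IGNORECASE)

def count_image_links(wikitext: str) -> int:
    """Count image links in wikitext, not the number of uploaded files."""
    return len(IMAGE_LINK.findall(wikitext))

# Hypothetical sample: two image links and one category link.
sample = (
    "Intro [[Image:Foo.jpg|thumb|A caption]] text "
    "[[image:Bar.png]] and [[Category:Birds]]."
)
print(count_image_links(sample))  # 2
```

A count like this explains why Commons can show few "images": a file that is uploaded but never linked from a namespace 0 page adds nothing to the total.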

February effect

Hi, I have another problem. Look at active users and very active users. In February we lose them: end of January, growth; mid-February, drop; end of February, drop. This happens regularly, in 2006 and 2007, in all Wikipedias. I used it as a question for the board elections ;) (a simple test) Przykuta 14:20, 20 June 2007 (UTC)

Wikistats csv

Hi Erik,
thank you for your work on the stats. Just one small question: is the file csv.zip containing the Wikistats csv files no longer updated with the latest statistics? I downloaded it today and it is the same as the latest stats (30 April). I ask because I have created a little Python script that extracts data from the csv files to produce a more detailed list of Italian wikipedians (up to 1000 users), but if the data aren't updated it becomes useless. I noticed a previous (very old) post on this page about the csv files and the difficulty of generating them, so I wonder whether the same process generates the published stats and the csv files at the same time (with the zip file needing a manual update), or there are two different processes to run, or ... something else. Thanks Pier from it.wiki 09:11, 1 August 2007 (UTC)

The csv files are updated as part of the wikistats job; the zip with all csv files should be refreshed automatically. Erik Zachte (talk) 06:07, 29 February 2008 (UTC)

Wikistats for af.wikipedia

Hi, Erik

Hope I'm at the right place (and person!) for this. I've noticed that the Afrikaans Wikipedia's statistics [7] haven't been updated since June. Just curious as to why that is and if we (at the Afrikaans Wikipedia) can do anything to help speed up the process. Regards, Anrie 12:31, 24 September 2007 (UTC)

Gah! Sorry, just saw you request that statistics talk go on the relevant subpage. I'll repost. Anrie 12:33, 24 September 2007 (UTC)

Wikistats for af.wikipedia

Hi, Erik

Hope I'm at the right place (and person!) for this. I've noticed that the Afrikaans Wikipedia's statistics [8] haven't been updated since June. Just curious as to why that is and if we (at the Afrikaans Wikipedia) can do anything to help speed up the process. Regards, Anrie 12:34, 24 September 2007 (UTC)

Dumps are produced only at long intervals and are often corrupt these days. See [9]: dumps for all major Wikipedias have failed recently. I may wait a little to see if that can be fixed. Even in the best case wikistats are always 1.5 - 2 months behind due to the lengthy dump process. Erik Zachte 01:07, 25 September 2007 (UTC)
Okay, thank you for the info and your continued effort to provide us with statistics. Anrie 09:06, 25 September 2007 (UTC)

New Wiki stats?

I've been checking the wikistats website for a couple of weeks; I notice it hasn't been updated since May 15. Is a planned update coming soon? --Smeira 16:41, 2 July 2007 (UTC)

Even in the best case there is a lag of about a month due to the time it takes to make dumps. Unfortunately it is often the worst case, when some important dumps or even most dumps fail, as happened last month. The dump process is still very erratic, despite many improvements. Too few developers. And priorities elsewhere. Erik Zachte 01:09, 3 July 2007 (UTC)
Hello, I also want to tell you that your wikistats are used in German. In future the solution could perhaps be the new text server on the toolserver cluster; look at the image there. Greetings Kolossos 17:33, 20 October 2007 (UTC)

hi, Upgrade features

http://stats.wikimedia.org/EN/TablesWikipediaKO.htm

I have used two other IDs; this ID is my third.

I would like to merge the statistics of all three IDs.

Many people use several IDs.

Please upgrade the features... :)

-- WonYongTalk 07:33, 7 November 2007 (UTC)

stats for hi:

Please include the Hindi Wikipedia's stats at http://stats.wikimedia.org/EN/TablesRecentTrends.htm --hi:user:Sumit sinha —Preceding unsigned comment added by 59.94.136.136 (talk) 15:40, 26 November 2007 (UTC)

statistics of wikimedia projects

Can your script put "Circular references found" on a separate page instead of at the bottom of the "Category Overview" page?
Thanks a lot for this project
Crochet.david 82.243.133.6 (talk) 14:43, 19 January 2008 (UTC)

Average per month?

Dear Erik,

First, thanks for the great statistics! But I have one question about them, about the official count of articles. Is this an average for the month? Or the count on the first or last day of the month? (I ask because of a question on my discussion page from rubietje.) Thanks in advance for the answer and keep up the good work! user DZ on dutch wikipedia. —Preceding unsigned comment added by 82.176.196.36 (talk) 21:26, 28 February 2008 (UTC)

Last day of the month. In the past I ran stats on any day and up till the last data in the dump so that counts for a partial month were displayed on the topmost line (with exact date in the leftmost column), and extrapolated predictions for the whole month below it. However with dumps taking longer and longer to produce (English dumps > 6 weeks!) this was no longer tenable. Some articles were dumped days or weeks earlier than others.
Now I wait till all dumps have started in the current month, in other words contain full data for the previous month, and then process data until the last day of the previous month, and skip records for the current month. I hope all of this makes sense despite time of day and glass of wine :) Erik Zachte (talk) 06:05, 29 February 2008 (UTC)
Thanks, Dick ! —Preceding unsigned comment added by 82.176.196.36 (talk) 10:38, 29 February 2008 (UTC)
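
The cutoff rule described in the reply above (process data up to the last day of the previous month and skip records for the current month) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the wikistats implementation: the function names and the `(edit_date, title)` record layout are hypothetical.

```python
from datetime import date, timedelta

def last_day_of_previous_month(run_date: date) -> date:
    """Return the final calendar day of the month before run_date."""
    return run_date.replace(day=1) - timedelta(days=1)

def records_up_to_cutoff(records, run_date):
    """Keep records dated on or before the cutoff; skip current-month ones.

    Each record is assumed to be a (edit_date, title) tuple.
    """
    cutoff = last_day_of_previous_month(run_date)
    return [rec for rec in records if rec[0] <= cutoff]

# Hypothetical sample: a run at the end of February counts January data only.
sample = [
    (date(2008, 1, 15), "A"),
    (date(2008, 1, 31), "B"),
    (date(2008, 2, 3), "C"),
]
kept = records_up_to_cutoff(sample, run_date=date(2008, 2, 29))
print([title for _, title in kept])  # ['A', 'B']
```

Because the cutoff depends only on the run date, every wiki is counted up to the same day regardless of when its dump started within the current month.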

Updates regularity

Wikistats is really a nice thing, but since the dump process is so tiresome and messy, perhaps you should frankly write "analyzing the database dumps, which are usually created quarterly (or even once in half a year)" instead of "usually created monthly"? Otherwise people are cherishing vain hopes ;) Al Silonov 19:04, 3 March 2008 (UTC)

They are usually created monthly. But people don't realize that even in the best of circumstances the stats are still up to 2 months behind. For example, today the wikistats job for January finishes. It could only start when all dumps contained full data for January, i.e. had been run in February. The new stats will be the most recent until well into March. Anyhow, feel free to update any description. Erik Zachte (talk) 23:12, 3 March 2008 (UTC)

WikiStats Wikibooks tr

Hello Erik! The "5 recently active wikibook authors" and "20 recently active wikibook authors" statistics of tr_wikibooks are out of date. Can you update them? Here is the stats link. Srhat (talk) 12:47, 1 March 2008 (UTC)

Thanks for the update Srhat (talk) 20:05, 4 March 2008 (UTC)

BOT statistics

Hi Erik,

My robot (MSBOT) is NOT in the "bot" user group on fa.wiki, but the statistics list it among the bots. I just wonder what the reason is. Thanks for your nice statistics. -- Meisam 20:49, 7 March 2008 (UTC)
On most smaller wikipedias bot registration is incomplete, probably because of the effort involved in registering a bot 250 times. So I make the following assumption: if a name is registered as a bot on 10 or more wikipedias, it probably is a bot here as well. MSBOT is registered on 17 wikipedias. If Brion ever completes Single User Login (one never knows) this assumption will no longer be needed. Erik Zachte (talk) 03:26, 8 March 2008 (UTC)
Thanks for your answer. Take a look here for the statistics of my bot. It's a bot on 19 wikis. -- Meisam 07:27, 8 March 2008 (UTC) —Preceding unsigned comment added by 85.185.3.22 (talk)
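
The assumption Erik describes above (a name registered as a bot on 10 or more wikipedias is treated as a bot everywhere) can be sketched in Python like this. Only the threshold of 10 comes from the reply; the function name, the data layout, and the sample wiki codes are hypothetical, and this is not the actual wikistats code.

```python
BOT_WIKI_THRESHOLD = 10  # from the reply above: "10 or more wikipedias"

def is_probably_bot(name, local_bots, bot_flags_by_wiki):
    """Treat name as a bot if flagged locally or on enough wikis overall.

    local_bots: set of names in the local "bot" group (assumed layout).
    bot_flags_by_wiki: dict mapping wiki code -> set of bot names.
    """
    if name in local_bots:
        return True
    flagged_on = sum(1 for bots in bot_flags_by_wiki.values() if name in bots)
    return flagged_on >= BOT_WIKI_THRESHOLD

# Hypothetical data: MSBOT is flagged on 17 wikis but not on fa.
flags = {f"wiki{i}": {"MSBOT"} for i in range(17)}
flags["fa"] = set()

print(is_probably_bot("MSBOT", flags["fa"], flags))   # True
print(is_probably_bot("Meisam", flags["fa"], flags))  # False
```

The trade-off is the one raised in this thread: an account that runs as a bot elsewhere but edits by hand on one wiki is still counted as a bot there.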

