Decay article I: The mysterious birth rates

By Jens Kvorning, Jens Mammen & Morten Kjeldgaard

In this blog post, we will illustrate the data fabrication fraud perpetrated by Helmuth Nyborg in the so-called "Decay article" [1]. Until now, this web site has been an educational resource for the Danish public, written in Danish. However, because of the large international interest in, and misunderstandings of, the recent conviction of Helmuth Nyborg for scientific misconduct in connection with the above-mentioned article, this series will be in English.

In general, the "Decay article" is utterly flawed, ignoring known facts and presenting postulates without any justification whatsoever. In this blog post, however, we shall bypass the basic flaws of the "annuity model" and simply focus on a few basic concepts of demography which have been a source of great confusion in the "Decay case", namely the two very different measures called "crude birth rate" (CBR) and "total fertility rate" (TFR). The confusion is primarily caused by Nyborg's lack of understanding of these two measures until very late in the DCSD hearing process. Briefly:

  • The crude birth rate (CBR) is typically given as the number of births per 1000 population. When this number is computed per person, it actually represents the probability that a person in the given country, man, woman or child of any age, becomes a parent in the given year [also see the Wikipedia article].
  • The total fertility rate (TFR) is a theoretical measure computed for an average female, given the current age-specific fertility rates of her country, and assuming she survives from birth to end of her reproductive period. It is a popular measure often cited in the press and educational material, because it is easily understood as the average number of children an average woman gives birth to in her lifetime [also see the Wikipedia article].

In short, these two measures are very different. Both are calculated from raw, age-specific population counts and the number of births categorized by the mother's age. But there is no theoretically justified conversion of one measure to the other. Such a conversion is also unnecessary, since tables of both measures are readily available from the UN, the World Bank, the CIA, and other agencies that collect and assemble population-related data.
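To make the difference concrete, here is a minimal sketch of how the two measures could be computed from the same raw counts. The age groups, the numbers and the function names are hypothetical and chosen only for illustration; they are not taken from any UN or Danish table.

```python
# Hypothetical age-specific data for one year (illustrative numbers only).
# Each entry: (age group of mothers, number of women in group, births in group)
AGE_GROUPS = [
    ("15-19", 160_000, 800),
    ("20-24", 170_000, 8_500),
    ("25-29", 180_000, 21_600),
    ("30-34", 190_000, 22_800),
    ("35-39", 200_000, 10_000),
    ("40-44", 200_000, 2_000),
    ("45-49", 190_000, 190),
]
GROUP_WIDTH = 5  # years covered by each age group


def crude_birth_rate(groups, total_population):
    """Births per 1000 population; people of any age and sex are in the denominator."""
    total_births = sum(births for _, _, births in groups)
    return 1000.0 * total_births / total_population


def total_fertility_rate(groups, width=GROUP_WIDTH):
    """Children per woman: age-specific fertility rates summed over the reproductive span."""
    return width * sum(births / women for _, women, births in groups)


if __name__ == "__main__":
    tfr = total_fertility_rate(AGE_GROUPS)
    # Same women and same births, hence the same TFR, but two different total
    # populations (for example, a country with more or fewer elderly people).
    for population in (5_000_000, 6_000_000):
        cbr = crude_birth_rate(AGE_GROUPS, population)
        print(f"population={population}: CBR={cbr:.2f} per 1000, TFR={tfr:.2f}")
```

In this sketch the TFR stays fixed while the CBR changes with the size of the rest of the population, which is one way to see why a single TFR does not determine a CBR.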

What the annuity model requires, Nyborg doesn't have

This blog post ignores the abjectly poor thinking behind the "Decay article". There is hardly a correct or even reasonable assumption in the paper, and the home-brewed demographic "annuity" method is deeply flawed. In this blog post, however, we merely attempt to reproduce Nyborg's (Vig's) deeply flawed calculations, and we show that this is not possible given the information Nyborg has provided in the Decay paper or any of the many detailed documents published by Vig on his web sites.

In the "Method and analysis" section of the Decay paper, Nyborg reports that the so-called "annuity model" is used for the population projections:

$$N_{1979}\times(1 + (b-d)/1000)+i_{fo}+i_{np} \tag{1}$$

where $N_{1979}$ is the population count for the year 1979 (Nyborg calls this 'status count'), $b$ is the crude birth rate, $d$ is the crude death rate, $i_{fo}$ is the increase in the number of citizens of foreign origin and $i_{np}$ is the number of naturalized citizens (presumably from the year before). The denominator of 1000 accounts for the "per 1000 population" unit implicit in the CBR data. (However, the formula as it stands is incorrect, since crude death rates are given per 10,000 population in Denmark.) Given these data in the right units, it is in principle possible to compute the population count in 1980 and the following years by applying the formula recursively.
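The recursive use of formula (1) can be sketched as follows. This is only an illustration of the arithmetic, assuming that both rates are expressed per 1000 population and held constant over the projection; the function names and the example numbers are ours and are not taken from Nyborg's or Vig's material.

```python
def annuity_step(n_prev, b, d, i_fo, i_np):
    """One application of formula (1).

    n_prev : population count of the previous year
    b, d   : crude birth and death rates, both assumed to be per 1000 population
    i_fo   : increase in the number of citizens of foreign origin
    i_np   : number of naturalized citizens
    """
    return n_prev * (1 + (b - d) / 1000) + i_fo + i_np


def project(n_start, years, b, d, i_fo, i_np):
    """Apply formula (1) recursively; returns the start count plus one count per year."""
    counts = [n_start]
    for _ in range(years):
        counts.append(annuity_step(counts[-1], b, d, i_fo, i_np))
    return counts


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for offset, count in enumerate(project(5_000_000, 3, b=12.0, d=11.0, i_fo=4_000, i_np=1_000)):
        print(1979 + offset, round(count))
```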

In the Decay article Nyborg cites the United Nations as the source of the data used for the "retro-corrected projection" of the total population of Denmark, which he says is:

» based on the total birth rates provided by United Nations (UN: for each of 235 COO [countries of origin] «

and in the article's reference section, a link is given to the place where Nyborg claims the data can be found, namely a poster from the UN's Department of Economic and Social Affairs, Population Division, called World Fertility Patterns 2007 [2]. Apart from the fact that "total birth rates" is an undefined quantity, this sounds very authoritative; the UN's population statistics are certainly a trustworthy source of information. The only problem is that the data listed in the poster cannot be used in the "annuity" formula given in the Decay paper! The reason is that the poster lists TFRs but the "annuity" formula requires CBRs.

We immediately realized this very serious problem when examining the Decay article in detail, and in our response to the DCSD (UVVU) of Dec. 7, 2011 we pointed out that the given reference — contrary to what is stated in the article — could not be the source of data for the Decay article (page 10 and footnote 13).

The DCSD offers Nyborg an escape

In its draft decision of June 14, 2013, some DCSD members speculated that Nyborg had simply given an incorrect reference by mistake. They actually threw him a lifeline. However, in his response of July 14 to the draft decision, Nyborg clearly stated that he had not:

»In the following, I will show that the majority's assessment is based on false premises, as evidenced by the data source referenced.

The attached Annex 1 is the table that appears when you will search the source reference (link) in Decay article:

If you now look under the "Year 2000-2005", "Total fertility per woman" is seen actually the birth data, the UN sets for individual more, less and least-developed countries (UN's, not my terminology). It is actually the number on which the model analysis is based. It is these figures that form the basis of the proportional conversion to the number of children born per 1,000 per year in the sub-population by ethnicity, as described in Decay article.« [My translation].

Apparently, Nyborg never understood what the problem of CBR vs. TFR was. Remember, he didn't develop the "annuity model", he didn't carry out the calculations, he didn't assemble the data, he didn't make the graphs, and he didn't make the analysis. Vig did. Here, for the first time, Nyborg mentions something about a "proportional conversion" of TFR to CBR, which he claims is "described in the Decay article". But it is not. The Decay article contains the term "proportion" 8 times, and none of these have to do with a computational conversion of TFRs to CBRs. Nyborg is lying to the DCSD.

Nyborg lies... again

In his response to the DCSD of Sep. 12 Nyborg claims that:

»... the description of the conversion had to be removed from the Decay manuscript (along with several other clarifications and elaborations) because the editors required the manuscript reduced to 5,000 words. This time, as well as previously (as editor or reviewer), I have protested to publisher (Elsevier Science, Oxford) against what I consider to be an unfair and inflexible standard limitation of research manuscripts, but without success. The editors simply give all researchers choice between shortening their manuscript, or to accept a priori rejection of the publication.« [My translation].

So, Nyborg claims the description of the so-called "proportional conversion" was present in an earlier draft to the manuscript, but it had to be removed at the request of the editors. Only, this is a lie.

We know it is a lie, because we have access to an early draft of the methods section, which was written by Nyborg's ghost author Jørn Ebbe Vig. There is no mention of any "proportional conversion" in it. In fact, it is very close to the final version that appeared in Personality and Individual Differences. Also, there is no mention of any conversion procedure in any of Vig's numerous blog posts and papers on the subject.

Heavily pressured, Nyborg in his Sep. 12 response writes:

»But since the issue has been raised, September 9, 2013 I requested that the publisher Elsevier Science bring a brief description of proportional conversion as an addendum to the Decay article. The request well extends the total article limit of 5,000 words and changes nothing in the article's main conclusions, but it is my hope that the publisher will still bring it.« [My translation].

The text of the addendum that Nyborg claims to have sent to Elsevier Science on Sep. 9, 2013 can be seen here. It reads:

»The analysis uses official UN Total Fertility data downloaded from
As the simple projection model used cannot accept Total Fertilities, it was necessary to proportionally convert the UN Total Fertility to Crude Birth Rates via Nationmaster
which covers both total fertility and Total Crude Birth Rates. This procedure changes nothing in the overall outcome of the study.«

While the link is now dead (Nationmaster has reorganized its web site), it used to lead to a page showing the birth rates of Afghanistan. Fortunately, the page can still be found in the Wayback Machine, and if that ever goes away, we have a screenshot of the page here.

As you can see, there is no mention of any procedure, algorithm, conversion factor or anything else explaining how to carry out the TFR to CBR conversion. There is only crude birth rate data from Afghanistan, and it is difficult to understand what that has to do with anything. In fact, what Nationmaster provides is the same as what the UN's Population Division and all other data providers offer, namely tables of CBRs and TFRs. Nationmaster provides data assembled by the CIA, which are perfectly fine and not very different from the UN's. So why not just use their CBRs directly, instead of insisting on a strange and methodologically flawed multiplication procedure? Well, the explanation is pretty straightforward, as we shall see later.

How Nyborg could believe that anybody would fall for such a disorganized, confused and misleading "addendum" referring to Afghanistan is incomprehensible.

What a newspaper opinion piece can reveal

The next piece of information became publicly available on Jan. 10, 2014, when Nyborg published an opinion piece in the weekly newspaper Weekendavisen with the presumptuous title "Do the arithmetic". Nyborg ridicules his critics, accusing them of not being able to do simple calculations, but in the same paragraph he admits that:

» However, the method section in the Decay article lacks the description of a proportionality conversion that converts UN's Total Fertility rates to Crude Birth rates, that is needed by my population model.« [My translation].

So perhaps the critics had a point after all? A few days later, Vig, with his characteristic indiscretion, published on his blog what appears to be a draft of Nyborg's opinion piece, which is longer and much more technical. Most likely, Weekendavisen asked Nyborg to shorten it and remove the boring detail. But here the conversion procedure is specified in detail, so we can actually see what Nyborg (read: Vig) has done:

»First, the reader should google the specified link in the article to the UN Decay birth data:

Look up for example Eritrea, and find the UN estimated Total Fertility Rate for 2000 (ie, 5.20).

Next, google the second specified link:

Here, look up the year 2005 and find Crude Birth Rate for Eritrea (i.e. 38.71). Also find NationMasters Total Fertility Rate for 2003 (ie, 5.74).

Now the reader has the necessary data for the proportional conversion. Plug the numbers into the following equation: 5.20 * 38.71 / 5.74.

This example should hopefully lead the reader towards the figure 35.07, which was what I found and used in the analysis. This will confirm that I referred correctly to the UN data source and further demonstrate how the conversion works for the analysis of birth data for Eritrea.« [My translation].

Never mind that the link is constructed and doesn't work and that the numbers are wrong, the important thing is that we now know how to do the conversion:

$$CBR_{Nyborg} = \frac{TFR_{UN} (2000-2005) \times CBR_{Nationmaster} (2005)}{TFR_{Nationmaster}(2003)} \tag{2}$$

and one such correction must be carried out for each individual nationality. We shall refrain from commenting on the complete absurdity of mixing data in this fashion, and confine ourselves to noting that the calculations can technically be carried out.

But let us take a step back for a moment: what just happened? Nyborg unwittingly admitted fabrication of data. Indeed, he has now admitted that the Decay article does NOT use data from the United Nations. This is data from the United Nations multiplied by some unrelated factors that come from somewhere else, and then it is no longer data from the United Nations. This is quite simply an example of data fabrication and scientific fraud.
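As a check on the arithmetic only, equation (2) can be written as a one-line function and applied to the Eritrea numbers quoted above. The function and parameter names are ours; the three input values are the ones given in the quoted draft.

```python
def nyborg_cbr(tfr_un, cbr_nationmaster, tfr_nationmaster):
    """Equation (2): scale the UN TFR by the Nationmaster CBR/TFR quotient."""
    return tfr_un * cbr_nationmaster / tfr_nationmaster


if __name__ == "__main__":
    # Eritrea, using the figures from the quoted draft of the opinion piece.
    eritrea = nyborg_cbr(tfr_un=5.20, cbr_nationmaster=38.71, tfr_nationmaster=5.74)
    print(round(eritrea, 2))  # rounds to 35.07, the figure quoted in the draft
```

That the arithmetic reproduces the quoted figure says nothing about whether mixing the three sources is meaningful; it only shows what the recipe computes.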

Using Vig's constructed CBRs

Now that we know how Vig produced the list of CBRs needed to make the "annuity" formula (1) work, let us see how it works. Thus, we take the TFR values from the 2007 UN poster, the Nationmaster CBRs from 2005 and the Nationmaster TFRs from 2003, plug them into the conversion formula (2), and run the retro-corrected projection (1).

The results of the simulation are shown below. Our software can reproduce all the figures in the Decay-paper, but only some are shown here. The graphs shown in dashed lines have been read off the figures in the Decay paper.

When running the projection, Vig substitutes the birth rates of Western nations with what he oddly calls the "UN-recommended" birth rate of 9.6 per 1000. In addition, that birth rate is stepped down by 0.1 every 7th year after 2032. It is not clear exactly how this correction is used by Vig in his calculations. In the plots below, the CBR value 9.6 is used for Denmark only (indicated by the text "mode 1" in the plots), because this seems to reproduce the data better. Nyborg has not been able to document where the completely unrealistic number 9.6 comes from.

Figure 1

The Decay paper's Figure 1 shows the average birth rates for the period 1979–2009. The figure shows the same three graphs as Figure 14b in Vig's document perspektiv.pdf:

  • "Foreign birth rates, based on official UN statistics, and weighted by proportional representation in Denmark" (green)
  • The so-called "official ethnically mixed birth rates" 1979–2010 (blue)
  • The so-called "residual-estimated ethnic Danish birth rates" 1979–2010 (red).

Starting from the "retro-corrected" population numbers for the period 1979–2010, the birth rate of all foreigners residing in Denmark is constructed thus:

  1. compute the number of births by foreign parents by multiplying last year's population by the (constructed) crude birth rate of the respective home country,
  2. compute the birth rate by dividing by this year's population count from the projection,

ending up with this for year n:

$$CBR_{foreign} = \frac{CBR \times N(n-1)}{N(n)}$$

The "pure ethnic Danish" birth rate is computed by:

  1. compute the ethnic Danish population by subtracting the number of foreigners from the total population count given by SB (StatBank, Statistics Denmark). This number is referred to as the "residual Danish population",
  2. compute the number of births by ethnic Danish parents by subtracting the number of foreign births (as computed above) from the total number of newborns as reported by SB,
  3. compute the ethnic Danish birth rate by dividing the number of births computed above by the population count computed above:

$$CBR_{dk} = \frac{births_{SB}(n) - births_{foreign}(n)}{N_{SB}(n) - N_{foreign}(n)}$$

The last graph, which Nyborg calls the "official ethnically mixed birth rates", is computed from the total "retro projected" population numbers, by dividing the number of births for year n by the total population number for year n.
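The two constructions described above can be sketched in a few lines. This is our reading of the steps for a single nationality, with rates expressed per 1000 population; the factor of 1000 in the residual rate is our assumption about units (the formula above is written without it), and all names and numbers are hypothetical.

```python
def foreign_births(cbr_home, n_foreign_prev):
    """Step 1: births by foreign parents = home-country CBR (per 1000) x last year's count."""
    return cbr_home / 1000 * n_foreign_prev


def foreign_cbr(cbr_home, n_foreign_prev, n_foreign_now):
    """Step 2: the constructed foreign birth rate, CBR x N(n-1) / N(n), per 1000."""
    return cbr_home * n_foreign_prev / n_foreign_now


def residual_danish_cbr(births_sb, births_foreign, n_sb, n_foreign):
    """Residual rate: (SB births - foreign births) / (SB population - foreigners), per 1000."""
    return 1000 * (births_sb - births_foreign) / (n_sb - n_foreign)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    b_foreign = foreign_births(cbr_home=30.0, n_foreign_prev=380_000)
    print("foreign births:", round(b_foreign))
    print("foreign CBR:", round(foreign_cbr(30.0, 380_000, 400_000), 2))
    print("residual Danish CBR:", round(residual_danish_cbr(65_000, b_foreign, 5_500_000, 400_000), 2))
```

Because the residual rate is obtained by subtraction, any overestimate of the foreign births pushes the "ethnic Danish" rate down by construction.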

Attempt to reproduce Figure 1 from the Decay article, using crude birth rates constructed according to the recipe given by Nyborg in the so-called "addendum" allegedly sent to PAID. Dashed lines: data extracted from Figure 1. Solid lines: "retro-projection" using CBRs computed from equation (2).

As seen in the figure above, it is not possible to reproduce the data from the Decay article using Vig's constructed CBRs. In particular, the foreign crude birth rate in the Decay paper's Figure 1 is extremely high, and it cannot be reproduced with any source of CBRs available to us, nor with the method Nyborg claims his ghost author Vig used. Furthermore, in Appendix Q we have proved that this graph is identical to that of Figure 14B of Vig's document perspektiv.pdf, which is said to result from Vig's mysterious Alsager fertility rate data. Thus, the foreign crude birth rate data in Figure 1 cannot have been generated by the method Nyborg postulates but is directly plagiarized from Vig without reference or documentation.

In Figure 1, the so-called "residual-estimated ethnic Danish birth rates" are lower than we can reproduce in this simulation. This is related to the very high "foreign birth rate", because it results from a subtraction from the total birth number, which comes from SB.

The results of a similar analysis were given in Appendix R of our response to the DCSD (UVVU) of Aug. 14, 2013, in which we unsuccessfully attempted to reproduce Nyborg's Figure 1 using UN CBR data from 2002-2004, 2006-2007 and 2009-2010.

Figure 4

Figure 4 from the Decay paper shows the "retro-projected" fractional composition of various IQ groups versus time from 1979–2072. In Appendix Q we proved that this figure is identical to Figure 19 from Vig's document perspektiv.pdf. The number of graphs is the same (five), and the five IQ groups are identical to the groups shown in Vig's Figure 19.

Dashed lines: data extracted from Figure 4 of the Decay paper. Solid lines: "retro-projection" using CBRs computed from equation (2).

It is seen that the simulation with Vig's constructed CBRs cannot reproduce the data given in the Decay article. The most striking difference is the development of the "normal" IQ group (blue) which the Decay paper underestimates, and the "low" IQ group (purple) which the Decay paper overestimates. This refutes Nyborg's claim on the source of the CBRs that have been used to generate the retro-projection.

Figure 5

Figure 5 of the Decay article shows the retro-projected population sizes of the same five IQ groups as seen in Figure 4. The main differences appear in the "normal" (blue) and "low" (purple) IQ groups. In Nyborg's published figure (dashed lines), the two curves cross around the year 2064, while in our simulation using Vig's constructed CBRs they have still not crossed in 2072. Also, in the published figure the "low" IQ group grows much faster than in the simulation. It can be concluded that the simulation using Vig's constructed CBRs cannot reproduce the data given in the Decay article.

Dashed lines: data extracted from Figure 5 of the Decay paper. Solid lines: "retro-projection" using CBRs computed from equation (2).

Figure 7

Figure 7 shows the postulated IQ decline of the Danish population during the timespan of the projection. Both the total population (blue) and the immigrants and descendants of immigrants (red) are shown. The simulation using Vig's constructed CBRs, which Nyborg claims were used to generate the results in the Decay paper, shows a slower decline than the data from the published figure. At the end of the simulation (year 2072), the "immigrant IQ" is more than one IQ unit higher in the simulation than shown by Nyborg. Thus, the simulation using Vig's constructed CBRs cannot reproduce the data in the Decay paper.

Dashed lines: data extracted from Figure 7 of the Decay paper. Solid lines: "retro-projection" using CBRs computed from equation (2).


We have developed software that can carry out the "retro-projections" that were developed by Jørn Ebbe Vig but published by Nyborg as his own work in the so-called "Decay article" [1]. Our software is readily capable of using any source of crude birth rate data.

The Decay article cites as the source of national CBRs a poster containing fertility data (TFRs), which cannot be used as input to the "annuity model". After a year and a half of DCSD hearings, Nyborg became aware of this problem and reported that CBRs had been computed from the TFRs on the UN poster by multiplying the data for each country by a quotient obtained from another source of population data. It is unknown why the readily available UN CBR data were not used.

Using our software, we have carried out a simulation using the "proportionally corrected" TFR data, and we have shown that the figures shown in the Decay article cannot be reproduced in this manner. The conclusion is that Nyborg is either lying or does not know the true source of the data used in the Decay article.


1. H. Nyborg, The decay of Western civilization: Double relaxed Darwinian Selection. Personality and Individual Differences 53(2), 118–125 (2012).


Skanderborg IV: We have the conclusion, now we just need to get the data

We saw in the previous post in this series that Nyborg, in his basic hypothesis, takes it for granted that women are dumber than men. The only question is by how much. It is then rather convenient that one can rig the study in advance to get the desired result. How this works is what we shall look at now.

We shall look at the selection of IQ tests, also called a test battery, that Nyborg used in his Skanderborg project. For the 16-18 year olds it consisted of 20 subtests, of which 11 were standard intelligence tests (the so-called WAIS-R test). The 9 remaining subtests were various other intelligence and skills tests, some of which were developed by Nyborg himself. Of these 9, however, two are virtually identical: Nyborg's own Rod-and-Frame dependence test and the older Rod-and-Frame field dependence test, which in the N=62 dataset have a mutual correlation of 0.94.

In general, standardized intelligence tests cannot be used to determine sex differences, for the simple reason that they are not designed for it. On the contrary, they are designed to treat men and women alike. It is in itself an interesting fact that men and boys are better at tasks of a mathematical-spatial nature, and women are better at tasks with verbal content. Unless one wishes in advance to define mathematical-spatial abilities as a more important part of intelligence than verbal ones, the individual subtests must therefore be balanced so that they treat women and men alike, that is, so that a woman of normal intelligence and a man of normal intelligence both obtain an IQ of 100. This topic was treated earlier in a piece in Århus Stiftstidende on January 7, 2008. Since the 9 subtests that Nyborg added to the standardized WAIS-R tests thus favor men, it will surprise few that the combined test also favors men.

That this is so can be verified by looking at the effect scores we examined in the previous post on the Skanderborg project. If one calculates the sex difference in Nyborg's N=62 dataset using only the WAIS-R standard tests, one finds:

WAIS tests only (11):
average effect size: 0.1015628
average IQ equivalent: 1.523441
95% confidence interval of effect size: -0.3991621  < d <  0.6022876
95% confidence interval of IQ equivalent: -5.987431  < IQ <  9.034314

In other words, the IQ difference is 1.5 ± 7.5 in favor of the men (boys), a difference which, incidentally, is not statistically significant. Looking at the 9 extra tests alone, one correspondingly finds:

Nyborg tests only (9):
average effect size: 0.3486596
average IQ equivalent: 5.229894
95% confidence interval of effect size: -0.1540072  < d <  0.8513264
95% confidence interval of IQ equivalent: -2.310108  < IQ <  12.7699

That is, an IQ difference of 5.2 ± 7.5 in favor of the men, also not a statistically significant difference, but nevertheless a difference more than 3 times as large as the one found in the standardized test. The 9 extra subtests Nyborg added therefore shift the overall result in favor of the men, an effect called bias. See also the figure and figure caption below.
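The IQ equivalents in the two listings above appear to be the effect sizes multiplied by 15, the conventional standard deviation of the IQ scale. That factor is inferred from the printed numbers rather than stated anywhere in the text, so the following sketch should be read with that assumption in mind; the names are ours.

```python
IQ_SD = 15  # assumed standard deviation of the IQ scale (inferred from the listings)


def iq_equivalent(effect_size):
    """Convert an effect size d to IQ points under the IQ_SD assumption."""
    return effect_size * IQ_SD


def half_width(ci_low, ci_high):
    """Half-width of a confidence interval, i.e. the '±' part."""
    return (ci_high - ci_low) / 2


# Values copied from the two listings above.
WAIS = {"d": 0.1015628, "ci": (-0.3991621, 0.6022876)}
EXTRA = {"d": 0.3486596, "ci": (-0.1540072, 0.8513264)}

if __name__ == "__main__":
    for name, res in (("WAIS-R tests (11)", WAIS), ("extra tests (9)", EXTRA)):
        iq = iq_equivalent(res["d"])
        pm = iq_equivalent(half_width(*res["ci"]))
        print(f"{name}: {iq:.1f} +/- {pm:.1f} IQ points")
    print(f"ratio of effect sizes: {EXTRA['d'] / WAIS['d']:.2f}")
```

Both intervals contain zero, which is what "not statistically significant" refers to here.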

That Nyborg himself acknowledges that the composition of the test battery can be a problem is evident from the fact that in Nyborg (2005), p. 499, he reprimands Jensen for having made exactly the same mistake in an older study:

»However, the female g lead disappeared after Jensen eliminated the unusually large number of test items favouring females in the General Aptitude Test Battery, and repeated the factor analysis. In other words, the female g superiority was an artefact due to test bias that favoured females.«

and in the book chapter from 2003 he writes (p. 195):

»As mentioned previously, verbal and spatial tests typically benefit females and males differently; and their simultaneous presence in a test battery would tend to balance out the sex biasing effect.«

However, Nyborg incorrectly believes that the bias in favor of men, which he knows exists in his test battery, disappears through the statistical procedure he uses ("hierarchical factor analysis"):

»In other words, because the sources of variances due to test specificity and possible group factor biases are sorted out already at lower levels, the higher order g factor emanates as a largely uncontaminated function of general ability, reflecting mostly the variance that is common to all factors below.«

[Nyborg (2003) Sex differences in g, p. 197.]

In its report of March 16, 2006, the expert committee (Det Sagkyndige Udvalg) goes to some length to explain that Nyborg's assumption is wrong (p. 5):

»Nevertheless we want to put on record two clear mathematical facts that Nyborg does not seem to be aware of. First, there is an inherent unidentifiability in the g–factor method used by Nyborg, which makes it impossible to separately estimate sex differences in the mean of the g–factor from sex differences in the primary factors. Second, Nyborg claims that his version of the g–factor method avoids the problem that the conclusion will reflect gender bias in the composition of the test battery. However, we show in Appendix D that Nyborg’s g–factor method too has this undesirable property.«

And further, in section 7.2, the committee writes:

»Nyborg seems to think that when using the g-factor and somehow correlating this with the differences in the means of the test variables the above mentioned problems have been avoided. However, the method used by Nyborg based on the point biserial correlation inserted in the correlation matrix reduces approximately to a weighted sum of the effect sizes d (see Appendix D, formula (9)). Typically, these weights are almost all positive and, intuitively, what is measured is therefore the overweight of tests in the battery that favours one of the sexes. We are therefore precisely back to the problem that Nyborg purports to avoid. For the actual test battery that Nyborg uses we have a strong bias towards males with 16 or 17 out of the 20 tests giving a positive effect size.«

The conclusions of the expert committee can hardly be misunderstood. First (and here reference is made to a mathematical proof in Appendix D), the variable describing sex cannot be identified in the factor-analytic method, and second, the bias that exists in the test battery is not eliminated by the statistical wringer.

Of course not.

On the left is a boxplot of average effect scores when only the WAIS-R subtests are considered. It can be seen that the median of the women's effect scores is actually higher than the men's, although the spread is larger, and, as pointed out earlier, that the population of the 31 women is not representative. On the right is the boxplot as it appears when only Nyborg's 9 extra subtests are included. These tests clearly favor men, who lie considerably above the women, while the spread is considerably smaller.

Skanderborg III. Are women dumber than men? Nope.

In this blog post we shall once again take a closer look at Nyborg's data from the Skanderborg project, on which he based the claim of women's allegedly lower intelligence, a claim he has since mindlessly repeated as an indisputable fact. And if you contradict him, you are a member of a sinister Marxist/politically correct conspiracy whose sole purpose is to curtail his academic freedom of speech.

Nyborg focuses almost exclusively on the so-called g factor, a statistical quantity that can be computed from intelligence tests (or other tests) taken by a group of people; it cannot be determined from a single person. Some psychologists attach very great importance to this statistical quantity, which they believe is a measure of some inner, immutable and genetically heritable trait. Nyborg's calculations, which resulted in the so-called "loading of sex on g of 0.27" (the famous 27%), are of very little interest to anyone other than followers of this g religion. We shall therefore analyze Nyborg's data exclusively in a more traditional way, which is far easier to understand and which ultimately shows exactly the same as Nyborg's results... without taking the detour via the peculiar g.

To cover myself, in case anyone thinks it is a lie that the g-factor calculation is superfluous, see what the expert committee wrote in DSUR (page 16, top):

»Nyborg seems to think that when using the g-factor and somehow correlating this with the differences in the means of the test variables the above mentioned problems have been avoided. However, the method used by Nyborg based on the point biserial correlation inserted in the correlation matrix reduces approximately to a weighted sum of the effect sizes d. «

We shall therefore now look at Nyborg's data converted to so-called effect sizes, which are quite easy to understand. One starts with the file Adult_sex.xls, which contains Nyborg's raw test data for 62 subjects, categorized as "adults" (but who actually come from two different groups of 16-18 year old school pupils), shown here in Table 5.

Table 5 shows that the scoring in the different tests varies considerably. Some numbers, in tests 7 and 8, are in the hundreds, and tests 1-4 have negative scores. Tests 9-20 (which are the standardized WAIS intelligence tests) typically have scores in the range 0-20.

For each IQ test (each column in Table 5) one now computes the mean, and this mean is subtracted from all scores in the column. Next, the standard deviation of each column is computed, and all scores in the column are divided by this value.

The result of this calculation is a new table (Table 6), in which all tests have mean zero and standard deviation 1; they have now been converted to so-called effect sizes. In other words, the scores from the individual tests are on the same scale, the different tests can now be compared, and one can compute an average score for each subject.
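The column-wise standardization described above can be sketched as follows. This is a minimal illustration with made-up scores, not the contents of Adult_sex.xls; the function names are ours, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev


def standardize(column):
    """Subtract the column mean and divide by the column standard deviation."""
    m = mean(column)
    s = stdev(column)  # sample standard deviation (assumption)
    return [(x - m) / s for x in column]


def person_averages(table):
    """table: one row per person, one column per test. Returns one average score per person."""
    columns = list(zip(*table))                        # rows -> columns
    z_columns = [standardize(col) for col in columns]  # each test on the same scale
    return [mean(row) for row in zip(*z_columns)]      # columns -> rows, then average


if __name__ == "__main__":
    # Three hypothetical persons, two hypothetical tests on very different scales.
    raw = [
        [112, 7],
        [140, 12],
        [98, 9],
    ]
    for person, score in enumerate(person_averages(raw), start=1):
        print(person, round(score, 3))
```

After this step a test scored in the hundreds and a test scored from 0 to 20 contribute equally to each person's average.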

Vi kan således nu analysere disse gennemsnitsscores som hver person opnår. I første omgang er det god praksis at lave et såkaldt boxplot af sine data, så kan man få et hurtigt overblik over, om der er noget underligt ved ens data. Et sådant boxplot ses i figuren til højre. Til venstre i figuren ses mændenes gennemsnits-score (hvis man kan kalde 16-årige for "mænd"), scoren for hver person er markeret med en blå prik. Højden angiver hvor god en score drengen har fået, jo højere på grafen, jo bedre score. Til højre ses en tilsvarende analyse af pigernes score.

Det der er afbildet i den grå boks er alle der ligger i 2. og 3. kvartil, og de punkter der ligger uden for, hører enten til den eller den dårligste fjerdedel. Den mørke tværstreg angiver medianen som i dette tilfælde er den score, som halvdelen ligger over og halvdelen ligger under.

Figuren viser ret tydeligt, at pigernes fordeling ser lidt underlig ud. Der er en stor hale af piger, som har et dårligt gennemsnit. Tilmed er pigernes fordeling "skæv", hvorimod drengenes — som man ville forvente — er mere symmetrisk. Der er tilsyneladende noget galt med Nyborgs pigegruppe. Den er ikke repræsentativ.

Off we go to Monte Carlo!

In the experiment, 2 groups of 31 persons each are drawn at random 10,000 times, and the difference in average score is computed. The value t along the x-axis expresses the probability of the observed difference in average score between the two permuted groups of subjects, under the hypothesis that the groups are equal. Most simulations land around t=0, which corresponds to a probability of 50%. The groups furthest to the right are pairwise almost equal (they do not arise very often), so the probability of rejecting the hypothesis is very small. The groups furthest to the left are the most different (they do not arise very often either). The comparison of 31 boys with 31 girls is marked with a dot (all the other 9999 groups consist of a mixture of boys and girls). For that pair, one can reject the hypothesis that they have the same mean with a confidence of 90%, but that is not enough to call the difference statistically significant. Many of the "random" groups actually lie further to the left.

When the data are divided into groups, as in this case where we have two groups, one can carry out a so-called Monte Carlo simulation, in which the subjects are assigned to the two groups at random. There is one and only one way of dividing the subjects so that all the boys are in one group and all the girls are in the other. We can examine how the calculation turns out when we study a large number of other groups of 31 + 31 subjects, which will then contain both boys and girls.

The result of a run comparing 10,000 pairs of groups with 31 persons in each is shown to the right. The black dot shows the value obtained when Nyborg's group of 31 boys is compared with the 31 girls. The conclusion of the run is that 10.6% (two-tailed) of the runs consist of two groups that differ more from each other in average score than Nyborg's groups do. The convention is that this value must be below 5% before the results can be regarded as statistically significant.
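A minimal sketch of such a Monte Carlo (permutation) run is given below. It is not the program used for the figure: the scores are randomly generated stand-ins for the real data, the function name is our own, and for simplicity the difference in group means is used as the test statistic instead of the t value shown in the figure.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-tailed p-value: share of random regroupings at least as different as the real split."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random assignment of subjects to the two groups
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical average scores for 31 boys and 31 girls (NOT Nyborg's data).
data_rng = random.Random(1)
boys = [data_rng.gauss(0.2, 0.6) for _ in range(31)]
girls = [data_rng.gauss(0.0, 0.6) for _ in range(31)]

p_two_tailed = permutation_p_value(boys, girls)
print(f"two-tailed permutation p-value: {p_two_tailed:.3f}")
```

With the real data, a result above 0.05 would mean that mixed groups of boys and girls quite often differ as much as the boys-versus-girls split does.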

The result can in fact be checked with a single t-test, which gives a p-value of 0.1033. This is (again) greater than 0.05, so the difference in scores between the boys' group and the girls' group is not significant.
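For readers who want to see what such a t-test computes, here is a sketch of the test statistic. We assume the classical Student's t-test with pooled variance, which the text does not specify, and the function name is our own. Turning t into a p-value requires the t distribution, which is not in Python's standard library, so that step is left to a statistics package.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variances
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / standard_error
    return t, df

# Tiny hypothetical example: the second group scores one point higher on average.
t_value, degrees_of_freedom = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_value, degrees_of_freedom)
```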

How to cheat with significance

It is unclear how Nyborg computed the p-value for the g-factor loading on sex that is cited under table 1 in the 2005 article ("Significant at p (one-sided) = .016"). But note that Nyborg has chosen the so-called "one-sided" statistical test. For Nyborg it has the pleasant property that its p-value is exactly half that of the "two-tailed" test everyone else uses. The only cases in which a "one-tailed" statistic is relevant are those where one knows with unfailing certainty that the difference being measured is positive. For example, one might be measuring a distance, which for good reasons cannot be negative. In that case a "one-sided" statistic is relevant, otherwise not.

When Nyborg uses a "one-sided" statistic, it means that he assumes in advance that women are dumber than men, and that the only question is by how much.
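The halving effect is easy to demonstrate. The sketch below uses a normal approximation purely for illustration (an assumption on our part; we do not know which distribution Nyborg used), and the function name is our own.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """One-sided (upper tail) and two-sided p-values for a standard normal test statistic."""
    standard_normal = NormalDist()
    one_sided = 1.0 - standard_normal.cdf(z)               # only "larger than" counts
    two_sided = 2.0 * (1.0 - standard_normal.cdf(abs(z)))  # a difference in either direction counts
    return one_sided, two_sided

one_sided, two_sided = p_values_from_z(1.96)
print(one_sided, two_sided)  # the one-sided value is half the two-sided value

# If the difference had gone the other way, the one-sided test would simply not see it.
one_sided_neg, two_sided_neg = p_values_from_z(-1.96)
print(one_sided_neg, two_sided_neg)
```

By the same arithmetic, a one-sided p of .016 corresponds to a two-sided p of .032 when the test statistic has a symmetric distribution.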

Does the significance really matter?

So what about this significance: does it really matter whether it is 5% or 10%? Surely there is something to it anyway? There is no doubt that the boys' group scores better than the girls, is there?

It is true that the boys' group scores better. What the p-value says is that there is a small probability (about 10%) of being wrong when one claims that the difference between boys and girls is not due to chance.

So be it. Significance is, after all, only a question of whether Nyborg can draw the conclusions he wishes to draw about his small sample of 62 persons. It is another matter when Nyborg goes on to claim that the result is representative of the whole population. Everyone knows political opinion polls. They typically use about 1000 persons, which gives what is called a statistical uncertainty of about 3%. You can work out the number yourself: it is roughly the square root of 1000 divided by 1000. In the same way, Nyborg's study of 62 subjects carries a statistical uncertainty when he tries to say something about the whole population.
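The rule of thumb can be checked in a couple of lines. The sketch below compares it with the usual 95% margin of error for a proportion near 50%, and then applies the same rule to a sample of 62 only to show how quickly the uncertainty grows when the sample shrinks; the function names are our own.

```python
import math

def rule_of_thumb_uncertainty(n):
    """Square root of n divided by n, which is the same as 1 / sqrt(n)."""
    return math.sqrt(n) / n

def margin_of_error_95(n, p=0.5):
    """Textbook 95% margin of error for a proportion p estimated from n respondents."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (1000, 62):
    print(n, round(rule_of_thumb_uncertainty(n), 3), round(margin_of_error_95(n), 3))
```

For 1000 respondents both expressions give roughly 3%; for 62 they give roughly 12-13%.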

The 2005 article (table 1) states that the average effect size is 0.21, which corresponds to an average IQ difference between the boys' group and the girls' group of 3.15, in the boys' favour. But Nyborg conveniently omits the confidence interval of the effect size, so we simply compute it ourselves, because the confidence interval takes into account the statistical uncertainty of the very small group of 62 subjects:

All 20 tests:
average effect size: 0.2127563
average IQ equivalent: 3.191345
95% confidence interval of effect size: -0.2888424  < d <  0.7143551
95% confidence interval of IQ equivalent: -4.332636  < IQ <  10.71533

If one wishes to make statements about the whole population, as Nyborg does, all he can claim is that the difference between men's and women's IQ lies between about -4.3 and 10.7. So once again: are women dumber than men? Nope. Nyborg has not proved anything at all.
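A confidence interval of this kind can be approximated with a common large-sample formula for the standard error of an effect size. This is a sketch under stated assumptions, not a reproduction of the calculation above: the formula and the normal quantile are our choice, the function name is our own, and an IQ standard deviation of 15 is assumed (consistent with the ratio between the effect size and the IQ equivalent quoted above). The resulting interval is close to, but not identical with, the quoted one.

```python
import math
from statistics import NormalDist

def effect_size_confidence_interval(d, n1, n2, level=0.95):
    """Approximate confidence interval for a standardized mean difference d."""
    # Common large-sample approximation of the standard error of d.
    standard_error = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return d - z * standard_error, d + z * standard_error

IQ_SD = 15  # assumption: IQ scale with standard deviation 15
d = 0.2127563  # average effect size quoted above
low, high = effect_size_confidence_interval(d, 31, 31)
print(f"effect size: {low:.3f} < d < {high:.3f}")
print(f"IQ equivalent: {IQ_SD * low:.2f} < IQ < {IQ_SD * high:.2f}")
```

Whatever the exact method, the interval comfortably includes zero, which is the point of the argument.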

