Having just finished another pile of conference-paper reviews, it strikes me that the single most common stylistic problem with papers in our field is the abstract.
Disappointingly few Computer Science authors seem to understand the difference between an abstract and an introduction. Far too many abstracts are useless because they read just like the first paragraphs of the “Introduction” section; the separation between the two would not be obvious if there were no change in font or a heading in between.
The two serve completely different purposes:
Abstracts are concise summaries for experts. Write your abstract for readers who are familiar with more than 50% of the references in your bibliography, who will soon have read at least the abstracts of the rest, and who are quite likely to quote your work in their own next paper. Your abstract should implicitly answer experts’ questions such as “What’s new here?” and “What was actually achieved?” Write in a form that squeezes as many technical details as you can about what you actually did into about 250 words (or whatever your publisher specifies). Include details about any experimental setup and results. Make sure all the crucial keywords that describe your work appear in either the title or the abstract.
Introductions are for a wider audience. Think of your reader as a first-year graduate student who is not yet an expert in your field, but interested in becoming one. An introduction should answer questions like “Why is the general topic of your work interesting?”, “What do you ultimately want to achieve?”, “What are the most important recent related developments?”, and “What inspired your work?”. None of this belongs in an abstract, because experts will know the answers already.
Abstract and introduction are alternative paths into your paper. You may think of an abstract also as a kind of entrance test: a reader who fully understands your abstract is likely to be an expert and therefore should be able to skip at least the first section of the paper. A reader who does not understand something in the abstract should focus on the introduction, which gently introduces and points to all the necessary background knowledge to get started.
A (fictitious) bad example:
Intrusion detection with neural networks and fuzzy logic
Abstract: With the continuous growth of the Internet, security intrusions become an ever bigger problem for the information society. Intrusion detection systems are intended to alert system administrators to suspicious events in log files, to help in rapid discovery and remediation of security incidents. In this work, we have used a novel type of neural network combined with a fuzzy logic classifier. Be believe that this approach can substantially improve the state of the art.
The same paper could have been abstracted for experts in a much more informative way:
Intrusion detection with neural networks and fuzzy logic
Abstract: In the learning phase, we fed our FuzzyIDS with the system-call section of the BLAFAS’05 competition log-file training corpus. We first normalized filenames using Hugh’s method, then converted function call parameters into 6-element feature vectors using a slight modification of the SniffIt 3.1 preprocessor. The resulting 3200 vectors were randomly split into four groups to train four instances of the 4-layer backpropagation network in the GNU R neural-network toolbox. Each trained network was then fed again with all 3200 vectors, and the resulting output used to train McCaigh’s FuzzyClass classifier. The recall rate achieved by FuzzyIDS on the test set is 34% better than the BLAFAS’05 winner, at a comparable CPU load.
The first example gives no clue about what was actually done in the presented work, while the second gives readers a very quick idea of whether they are interested in the work and if so what they need to learn from the introduction before they can fully understand it.
Write the abstract last. It is conceptually much closer to the conclusion than to the introduction, so it is best written after the conclusion is finished.
Abstracts should stand on their own. Many expert readers will not have time to read more than your abstract. Do not use numeric references to bibliography, sections, or even footnotes in the abstract, because users of abstract databases may not have instant access to the full paper. Also avoid complex mathematical notation (subscripts, fractions, etc.), because abstract databases are unlikely to render them correctly.
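Some of the mechanical rules above (the word limit, no numeric references, keywords in the title or abstract) can be checked automatically before submission. The following is only a minimal sketch under stated assumptions: the function name `check_abstract`, the 250-word default, and the regular expressions are illustrative choices, not part of any publisher's tooling, and no script can judge whether an abstract actually tells experts what was achieved.

```python
import re


def check_abstract(abstract, title, keywords, max_words=250):
    """Return a list of warnings for an abstract draft (hypothetical helper)."""
    warnings = []

    # Word limit: 250 is only the rough figure from the text; use your publisher's.
    word_count = len(abstract.split())
    if word_count > max_words:
        warnings.append(f"too long: {word_count} words (limit {max_words})")

    # Numeric bibliography references such as [3], [1, 2] or [4-6].
    if re.search(r"\[\d+(?:\s*[,\-–]\s*\d+)*\]", abstract):
        warnings.append("contains numeric bibliography references")

    # Numeric cross-references such as "Section 2" (assumed pattern, not exhaustive).
    if re.search(r"\b(?:Section|Figure|Table)\s+\d+", abstract):
        warnings.append("contains numeric cross-references")

    # Every crucial keyword should appear in either the title or the abstract.
    searchable = (title + " " + abstract).lower()
    for keyword in keywords:
        if keyword.lower() not in searchable:
            warnings.append(f"keyword missing from title and abstract: {keyword}")

    return warnings
```

Calling `check_abstract(draft, title, ["intrusion detection", "recall"])` returns an empty list when none of these mechanical checks fail; it says nothing about the quality of the content.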
Writing any kind of paper is quite a difficult task in itself, let alone writing the abstract, which is, as you say, the only part that becomes public for free (due to tradition and publishers).
As you note, just about the only initial contact people will have with your paper is the abstract, often in the form of a search result in a citation / abstract database on CD/DVD or online.
I used to work for the founding company in that game and I know from experience that the abstracts have many failings, not just the two you mentioned in your last paragraph.
Unfortunately citation / abstract databases are the way more and more young researchers are finding papers (often it’s the way they are shown by their Uni Learning Resource Center).
So the abstract is becoming critical not just to your paper but to your career and it should not be, which is unfortunately an issue that you can lay at the paper publisher’s door. In general they only allow the abstract to become public and as they want to minimise their costs they place quite severe limits on its size.
To be really useful, the abstract, introduction and optionally the conclusion need to be searched and seen by the researcher. In the past, with paper publications held in the Uni library, this was not too much of an issue, as the abstract acted as just a “taster” or pointer. Today, however, this is often not the case, and a researcher or student has to use an online or CD service.
As paper writers you should be pushing the publishers to make the full text of the paper available to the citation / abstract database companies to provide full text searching. You also need to push for more than just a 250-word abstract to be made available in the search results: as a minimum a lengthier introduction, and ideally part or all of the conclusion.
Otherwise you will have to change the old maxim “publish or perish” to something new as publishing alone just will not get you “noticed and quoted” any more.
If ‘young researchers’ and other people interested in finding good sources quickly are spending some time looking through online abstract services, then surely they’ll get a good idea about how to write a good abstract quite quickly: a good abstract is one that follows the style of those associated with the useful sources you found easily.
So if young researchers are making good use of abstract services, then they ought to have a good (if ostensive…) idea about what an abstract ought to contain.
But I agree with Markus: it’s annoying when you come across an abstract that reads more like the back of a novel: ‘will we ever find out whose FuzzyClass classifier was trained by the resulting output?!’
A good presentation on how to write a research paper from Simon Peyton-Jones
http://research.microsoft.com/~simonpj/papers/giving-a-talk/writing-a-paper-slides.pdf
Simon Peyton-Jones offers a different perspective on how to write abstracts (p9 of https://web.archive.org/web/http://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/granitzer/How_to_write_a_good_research_paper.pdf). Markus, what do you think about Simon’s perspective?
Peter Lewis writes:
But I agree with Markus: it’s annoying when you come across an abstract that reads more like the back of a novel: ‘will we ever find out whose FuzzyClass classifier was trained by the resulting output?!’
This is an interesting comparison. The back covers of books contain just a few sentences very carefully crafted to catch the browser’s attention and get them to read the book.
It is very telling that some (or indeed many) academics appear not to want this advertising service from an Abstract. Yet we all want our papers to be read and noticed, don’t we?
Maybe one day papers might have two brief paragraphs on the front? An “advert” and a “synopsis”.
Mike.
There is a reason why they are called “computer science” students, and not “humanities” or “arts” students. Seriously though, at the undergraduate level there is exactly one desideratum for humanities students: the ability to clearly articulate an argument. No such similar emphasis exists for the sciences, much less the engineering/IT sciences.
Your suggestions for an abstract are exactly right. It does no good to try to pressure a publisher into offering full-text for indexing, since that would defeat the purpose of a well-written abstract. The abstract SHOULD include all the keywords, and SHOULD give a clear distillation of the emphasis of the paper, so a potential reader can assess the likelihood of substantial value arising from reading the full text.
Dear Mr. Markus:
I am so impressed with your advice that I cannot resist translating its main content into Chinese and sharing it with my friends on my blog. The original link to your post is referenced in my post. Please allow me to do it. Thanks!
Sincerely
Yishuai
What an excellent description of how to write an abstract! Thank you, thank you, thank you. I will share it with all my doctoral students.
Is the misprint (“Be believe”) in the “bad” abstract deliberate? I fully endorse the message, having spent some time (20 years ago) at another university trying to promote better use of IT to support and improve research-student writers – with considerable pushback from supervisors who resented “interference”. In respect to PhD theses, the “tell us what you achieved” angle needs even more emphasis. Students mechanically sign the claim that “this thesis is my own work” (while acknowledging supervisor, other staff, technicians) but are browbeaten into depersonalized academic writing. My favourite Abstract read – almost literally – “Samples were collected, samples were analysed, results were obtained, results were interpreted. These findings add to domain knowledge.”