Tuesday, 8 July 2008
Data, Information, Knowledge, Wisdom
"Information is both more extensive than data and many instances of it are logically stronger than data. Information is irreducible to data. [...] This makes knowledge and information synonymous. Knowledge and information collapse into each other"
"And the wise person must not only have wide appropriate knowledge, but they must act in accordance with the knowledge they have."
The article also mentions evidence, but in a different context from the "evidence-based practice" use - here it is more related to knowledge (with some discussion of whether this means "know-that" or "know-how") and wisdom.
http://dlist.sir.arizona.edu/2327/01/The_Knowledge_Pyramid_DList.pdf
Thursday, 26 June 2008
ISKO event on information retrieval
Went along to some of the ISKO event on information retrieval today...
Brian Vickery was up first but unfortunately I missed most of his talk. I did catch the last few minutes, though, where he asked some very pertinent questions:
- What is the case for building classifications, thesauri and taxonomies? How does this relate to the needs of Communities of Practice?
- Are the benefits of controlled retrieval languages strong enough to justify the effort and cost of creating/maintaining/using them?
- Is there a growing need to harmonise or match terminologies?
- What is the future for "universal" controlled languages and general classifications/ontologies?
Next up was Stephen Robertson, giving a researcher perspective. He pointed out that although web search engines have been very successful, other systems cannot say the same - perhaps because the extensive machine learning available to Google et al just isn't feasible for a smaller setup. Robertson mentioned some useful sources of evidence for evaluating retrieval - notably click-throughs and "dwell time" (how long a user spends somewhere before returning to the search results). There is some rich data out there, but it is also "noisy".
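(Just to make the idea concrete, here is a minimal sketch of how dwell time might be derived from a click log - the log format and field names are my own illustrative assumptions, not anything Robertson described.)

```python
from datetime import datetime

# Illustrative search-log events for one session:
# "serp" = user is on the search results page, "click" = user opened a result.
events = [
    ("2008-06-26 10:00:00", "serp"),
    ("2008-06-26 10:00:05", "click"),
    ("2008-06-26 10:02:35", "serp"),   # came back after 150 seconds
    ("2008-06-26 10:02:40", "click"),
    ("2008-06-26 10:02:48", "serp"),   # came back after 8 seconds - probably not useful
]

def dwell_times(events):
    """Seconds spent on a clicked result before returning to the results page."""
    parsed = [(datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), action) for t, action in events]
    dwells = []
    for (t1, a1), (t2, a2) in zip(parsed, parsed[1:]):
        if a1 == "click" and a2 == "serp":
            dwells.append((t2 - t1).total_seconds())
    return dwells

print(dwell_times(events))  # [150.0, 8.0] - longer dwells suggest the result was useful
```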
Last up was Ian Rowlands, who talked about the implications of the Google Generation report. He started with some context - insecurity around the power of the Google and Yahoo brands; the devaluing of the "library" brand; the hypothesis that the younger generation is somehow different. He referred to various pieces of research, including Carol Tenopir's long-standing survey of academics. The bottom line of the Google Generation report is that it is a myth - yes, there is a type of user behaviour which is comfortable online, but the Google Generation is not a homogeneous mass of people - "silver surfers" (another irritating term!) demonstrate the same characteristics, and there are also "digital dissidents" among the younger generations who are shunning technology. So, the general message is to stop thinking of our users as fixed targets who fit some kind of stereotype. We need to understand user behaviour much better, in particular online reading - but then, how much do we really understand about how people read/absorb information in print? How can we be sure that what we learn about online reading is peculiar to the online environment and isn't just typical of reading in whatever format?
Rowlands also suggested that we need to help users form "mental maps" of information - typically, when you walk into a library for a print resource, you have a reasonably good image of what you are expecting to find; the same can't be said of the web. There is a message for librarians here to help create easier access to information for users, e.g. through less confusing terminology. Information literacy is key, but research seems to suggest that unless individuals learn these skills from a young age, the changes possible in user behaviour later on are more limited. There have been studies demonstrating a correlation between information literacy and academic grades.
Rowlands finished with a plea to understand our users better - stop thinking of them as one big mass which can be served by a one-size-fits-all solution, and learn from the commercial world, where customers are segmented and can follow a number of routes to information - though, I have to say, the commercial world doesn't always get it right either, and it has greater resources at its disposal.
Thursday, 3 April 2008
BT using social networking internally
New classification scheme for research in Australia and New Zealand
"[The ABS and Statistics NZ have] jointly developed ANZSRC to serve as a standard research classification for both countries. It will improve the comparability of research and development statistics between the two countries and the rest of the world. For the ABS and Australian stakeholders, ANZSRC replaces the Australian Standard Research Classification (ASRC 1998) and for Statistics NZ and New Zealand stakeholders ANZSRC introduces a new framework to measure R&D activity."
http://www.ausstats.abs.gov.au/ausstats/subscriber.nsf/0/2A3A6DB3F4180D03CA25741A000E25F3/$File/12970_2008.pdf
Tuesday, 29 January 2008
BiomedExperts
http://www.biomedexperts.com/Portal.aspx
Monday, 7 January 2008
Open science: implications for librarians
I like the take-home messages:
- Open science is driving transformational change in research practice: now
- Curating open data requires strong Faculty links and multi-disciplinary teams: Library + IT + Faculty
- Recognise and respect disciplinary differences: get to know the data centre people, new partnerships
- Libraries have a lot to offer: build on your repository experience
- Data underpins intellectual ideas: we must curate for the future
Monday, 10 December 2007
Knowledge Discovery Resources - Marcus Zillman
http://www.kdresources.info/
Friday, 16 November 2007
Chris Date Lecture @ NeSC
"Bill Pike (Pacific Northwest National Laboratory), in his presentation on integrating knowledge models into the scientific analysis process [...] described the challenge of trying to capture scientific knowledge as it is created, with workflow models that describe the process of discovery. In this way, the knowledge of what was discovered can be connected with
the knowledge of how the discovery was made."
"If future generations of scientists are to understand the work of the present, we have to make sure they have access to the processes by which our knowledge is being formed. The big problem is that, if you include all the information about all the people, organisations, tools, resources and situations that feed into a particular piece of knowledge, the sheer quantity of data will rapidly become overwhelming. We need to find ways to filter this knowledge to create sensible structures... "
"One method for explicitly representing knowledge was presented by Alberto Canas (Institute for Human and Machine Cognition). The concept maps that he discussed are less ambiguous than natural language, but not as formal as symbolic logic. Designed to be read by humans, not machines, they have proved useful for finding holes and misconceptions in knowledge, and for understanding how an expert thinks. These maps are composed of concepts joined up by linking phrases to form propositions: the logical structure expressed in these linking phrases is what distinguishes concept maps from similar-looking, but less structured descriptions such as "mind maps". "
Friday, 2 November 2007
Latest Ariadne : NaCTeM, repositories and KIDDM
Good to see NaCTeM :-) A good overview of the current services and a run-through of their roadmap:
"NaCTeM's text mining tools and services offer numerous benefits to a wide range of users. These range from considerable reductions in time and effort for finding and linking pertinent information from large scale textual resources, to customised solutions in semantic data analysis and knowledge management. Enhancing metadata is one of the important benefits of deploying text mining services. TM is being used for subject classification, creation of taxonomies, controlled vocabularies, ontology building and Semantic Web activities. As NaCTeM enters into its second phase we are aiming for improved levels of collaboration with Semantic Grid and Digital Library initiatives and contributions to bridging the gap between the library world and the e-Science world through an improved facility for constructing metadata descriptions from textual descriptions via TM."
Other interesting snippets:
- SURFshare programme covering the research lifecycle http://www.surffoundation.nl/smartsite.dws?ch=ENG&id=5463
- a discussion on the use of Google as a repository : "Repositories, libraries and Google complement each other in helping to provide a broad range of services to information seekers. This union begins with an effective advocacy campaign to boost repository content; here it is described, stored and managed; search engines, like Google, can then locate and present items in response to a search request. Relying on Google to provide search and discovery of this hidden material misses out a valuable step, that of making it available in the first instance. That is why university libraries need Google and Google needs university libraries."
- feedback from the ECDL conference, including a workshop on a European repository ecology, featuring a neat diagram showing how presentations are disseminated after a conference using a mix of Web 2.0 tools, repositories and journals http://www.ariadne.ac.uk/issue53/ecdl-2007-rpt/#10
Wednesday, 19 September 2007
Report from BCS KIDDM Mash-Up
Peter Murray has written up some of the day's presentations on his blog.
Conrad Taylor, introducing the day, covered issues around mark-up and tagging, referring to the difficulties of marking up audio/video and unstructured text; time constraints; and difficulties of subject classification.
Tony Rose talked about information retrieval and some of the innovative approaches out there:
- semantic searching - as demonstrated by hakia and lexxe
- natural language processing - as demonstrated by powerset and lexxe
- disambiguation - as demonstrated by quintura
- assigning value to documents - as demonstrated by Google and Ask (see the sketch below)
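(As a rough illustration of the "assigning value to documents" idea, here is a minimal power-iteration sketch of a PageRank-style score. The toy link graph and the usual textbook damping factor are my own choices; nothing like this was presented at the event.)

```python
# Toy PageRank-style scoring: a document's value comes from the value of
# the documents that link to it.
links = {
    "a": ["c"],   # pages "a" and "b" both link to "c"; "c" links back to "a"
    "b": ["c"],
    "c": ["a"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

print(pagerank(links))  # "c" ends up with the highest value: it has the most inlinks
```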
He sees the future of search as addressing the following:
- rich media search
- multi/cross lingual search
- vertical search
- search agents
- specialised content search
- human UI
- social search
- answer engines
- personalisation
- mobile search
Tom Khazaba from SPSS talked about their products for text and data mining and the various applications they're used for (CRM, risk analysis, crime prevention etc.). He stressed that the results of text analysis have to be fitted into business processes and mentioned briefly how Credit Suisse have achieved this. He listed the keys to success for text/data mining solutions:
- ease of use
- supports the whole process
- comprehensive toolkit - i.e. it features visualisation, modelling etc., so all you need is in one place
- openness - using existing infrastructure
- performance and scalability
- flexible deployment
Dan Rickman introduced geospatial information systems. He referred to the importance of metadata and ontologies for handling the large volumes of unstructured data. In geospatial information there is also a temporal aspect, as many applications will view an area over time. He mentioned the OS's work on a Digital National Framework, which has several principles:
- capture information at the highest resolution possible
- capture information once and use many times
- use existing proven standards etc