Taxonomy Meetings 2011 – A Year of Change or Realization?
January 9, 2012
Posted in Access Insights, Business strategy, Featured, reference, Taxonomy
What are the meetings that cater to people who use controlled vocabularies, like taxonomies? Where should a taxonomist go, click, or attend to learn about the latest implementations and uses of controlled vocabulary strategies? Every company thinks long and hard both about what it does and where to find customers for its products and services. The information industry is no different. In the age of the Internet, when everyone “knows” about searching and information, it seems like the information industry should be booming, its conferences should be huge, and the attendance incredible, but that is not the case. Why? If the information industry and our little taxonomy segment of the business have gone mainstream, then where are all the people you would expect at the long-established industry meetings? Many of the meetings we have attended for years are dying on the vine. The SLA Expo was sparse, the Information Today meetings are smaller, Online Information (formerly International Online) was nearly empty, and NFAIS remains the same size each year. Others are thriving: ASIS&T is growing significantly, Frankfurt Book Fair is bigger than ever, and specific User Group meetings are increasingly targeted and well attended.
I believe there are several factors at work. The diminishing meetings have had challenges for years. Nor are we alone in this trend; it is national and perhaps international, and other options are now available. Nationally 126 million people attended meetings in 2009. In 2010 only 80 million attended. “There were 12 percent fewer attendees in 2010 than in 2005 – and 19.7 percent fewer in 2009 than in 2006,” notes the Baltimore Sun. The trend is significantly downward, even allowing for the problems of the economy. Let’s take the meetings one at a time.
SLA Conference and Expo – an expensive and glitzy exhibition held in conjunction with the SLA (Special Libraries Association) annual meeting. This meeting has had as many as 7,000 attendees and many auxiliary events such as user group meetings and advisory boards surrounding it. The meeting itself is significantly smaller now. The membership has fallen by half, from 14,000 to about 7,000, including a student contingent whose exact size is unknown but has probably held steady at about 2,000. This means SLA no longer commands as large an audience. In spite of a well-meaning board trying to cater to the un- and underemployed with reduced fees, the membership has been shrinking. The Expo has served two functions in the industry. One, of course, is to show the companies’ wares to the attendees, people who work in corporate and other kinds of unusual libraries and often command large purchasing budgets. The second is that having most of the players in the industry in a single exhibit hall allows for intellectual property rights discussions and for business arrangements and deals to be made. But several things have happened to make this a less attractive venue.
Years ago SLA mandated that a company could not sponsor a division’s activities, that is, get close to the real customer group, unless it was an exhibitor. That meant paying for the booth (about $5,000 for the smallest), paying for furniture, electricity, carpet, Internet, card reader, plus the art, brochures, giveaways, etc. (much more than $5,000). Then you need staffing for the booth, including airfare and hotel for at least two, but usually more, staff. (Another $5,000 – $7,000 in direct cost per person plus a week away from the office.) After that you get to be the target of every division to sponsor their events, at $500 – $5,000 each (there are 28 divisions and almost all of them will call you). So SLA needs to be at least a $30,000 line item in the budget, but is usually over $50,000 plus staff labor and opportunity cost. The business aspect of companies (a less degrading label than “vendors”; what are we, circus performers?) talking with companies has been good, but the increasing number of companies “suitcasing” (that is, attending without a booth) has made the exhibitors targets of not only the divisions and SLA, but also those who did not pay the freight to be in the show. Meanwhile, the attendees are walking the aisles, looking for giveaways, not making eye contact since they have no budget to spend.
More recently the Divisions have realized that they could get more out of their target companies if they held out the carrot of a speaking slot. If you pay X you are a sponsor; if you pay Y you can also have a speaking slot. That all works as long as there is a large audience to talk with. But over the past two years there have been very few people attending the meetings. The sessions of substance are well attended. I went to the taxonomy-related ones and they often drew standing-room-only crowds. Buying of speaking slots, however, degrades the programming options and also makes the exhibitors feel cheap. My expertise, on which I have been able to found and run a company, is only worth hearing if I pay you to listen? It feels like some kind of prostitution going on here!
At SLA 2011 many sessions had to do with how to get a job, get a raise, change careers, etc. These are helpful to the out of work perhaps, but NOT a persuasive reason for an employer to send a staff member to the meeting. Why should they send their staff to a meeting to learn how to get a different job? The early program was full of such sessions and was a turn-off to many of the employers and potential attendees I spoke with. They need to send people to the meeting for a skills and industry update and refresher.
Attendees are few because the programs are not delivering content. Business discussions among exhibitors have held them in the hall for the past few years, but is that enough to make the show a go? There are new options out there, as you will see later in this article.
Burdensome regulations, booth fee increases, and limited staff resources have resulted in a thin meeting on top of an already downward fiscal spiral for SLA. Can they pull it out? Perhaps they can, but probably not with the current strategy. That exhibit hall finances much of the SLA annual operations. An organization which gets more than half of its annual income from a single face-to-face meeting in the Internet age has some hard thinking to do.
Information Today built its reputation on the once premier meeting in the industry – the National Online Meeting. It sprang into being when SLA and ASIS&T missed the rise of online searching and the incipient Internet offerings as a potential big force in their lives. More recently this meeting has been cut into sections and targeted to specific groups, such as Computers in Libraries, Internet Librarian, Taxonomy Boot Camp, and Knowledge Management World. Each of them seems to draw a small but loyal crowd of attendees. The business aspect of the meeting has been lost, not much deal making goes on here, and the exhibits are shrinking. Here too, if you are a potential exhibitor, you are generally not allowed a speaking slot unless you pony up for a booth.
This has led to a platform of consultants, who plead inability to exhibit, hawking their services from the podium. The quality of the program is diminished, and the people with industry knowledge look for another avenue to get to the customer. The previous model, in which those who were speaking might perhaps also exhibit, has changed to no speaking unless you exhibit. Further, the segmentation of the meeting has meant that the exhibitors cannot pursue the deal making side of attendance that is so important to their livelihood.
International Online was also an Information Today meeting (okay, it belonged to Learned Information, which, when it sold the meeting, had to change its name to that of the very successful industry newspaper it publishes – not a bad thing). Traditionally held the first week of December in London, it was THE place to be for the buying and selling of digital rights and to see what new things were being released in the New Year. It was a vibrant, exciting meeting with a crush of people, big parties in the evenings, cutting edge presentations, and many user group meetings surrounding the IOM. One person commented that about 90% of the intellectual property rights deals and changes for the year happened in that week in December. This year the meeting was a shadow of itself. Most of the big players did not exhibit, and very few people walked through the hall. If you set up lots of meetings in advance, it was okay; otherwise it was a dud. What happened?
It became two unconnected meetings. The conference for delegates (attendees) was held on the third floor a block away from the exhibition, so the attendees seldom came down through the wet London cold to the exhibit hall. At the same time it became very expensive! Greed in the face of an economic downturn certainly plays a role, but this is not the only factor. Next year it is moving from Olympia to Docklands and changing its format. The meeting we knew is gone.
NFAIS has gone a different way. It is a membership organization of about 120 companies. But leveraging intellectual value added, including controlled vocabularies, is not the current focus of this meeting of former abstracting and indexing organizations. Their focus is on the “next big thing,” the trends in the industry. The program committee does NOT select member companies to speak. So if you are a member, you will not be on the podium except as a possible moderator. But NFAIS members would like to hear from members who are in a similar situation and find out how they have dealt with the problem. It is a cutting edge meeting, well planned and thought out, but it does not grow due to self-imposed limits.
ASIS&T, the American Society for Information Science and Technology, is often considered an academic meeting where professors can get their students’ papers on the program to showcase them. The Board is academic. The members are a mix, more academics than practitioners, but still with a fair number of people looking for new technologies and a way to implement them on the home front. I used to survey the audience and decided it was in three segments. The academics sat in the front of the hall ready to comment and debate with the speaker, the practitioners and managers sat in the middle soaking up what they could from the presentations and questions, and the entrepreneurs and other misfits were in the back, standing or on the aisles with an easy exit plan. It is still that way except that the middle has thinned out considerably. The meeting this year was a pleasant surprise on many fronts. It was a substantive program, with lots of hard-hitting application and real life talks and fewer presentations extrapolating unrealistic results from a sample of 10 – 30. The talks were longer, at 30 minutes, which allowed enough time to actually describe the substance and then have penetrating questions. The student papers were moved to a huge poster session – 92 posters replacing the Presidential reception, with dinner in the middle and posters around the edge – great for conversations and good learning experiences. Well done. Some even had to do with taxonomies.
But for a lot of application and implementation discussions, the action has moved to the ASIS&T IA Summit. The information architecture meeting now has as many attendees as the annual meeting (around 700 people) and has its own Web site and branding. It is far less academic, with much more hands-on discussion. I found the meetings clannish, but the discussions were worth listening to.
Frankfurt Book Fair – a few years ago this meeting was only for print publishers, although it was THE meeting for print. But as digital media has taken hold, a new pavilion was added, and the digital activity in Building 4 is now incredibly active. The rights trading is definitely done at this show now. The parties and the satellite meetings have mostly moved here. Publishers and the online community have merged to be here in Frankfurt in October.
User Group Meetings – these used to be satellite meetings around the bigger meetings, but their members were no longer attending the big meetings. They now go for the shorter, pure vendor updates and presentations, which deal directly with their service, product, or software. They use these specialized events to learn what’s new and how to use it better. It pays off back at the office, and you meet others who are using and leveraging the same things. I attended several of these during the year. They were uniformly well attended by enthusiastic people wanting to know more about the products and services so they could better manage their investments. The meetings that are viable are those that engage the attendee, and the User Group Meetings do. They are of two types: 1) those which follow the rock star level of presentation, like MarkLogic and SilverChair, and 2) those which are hands-on updates on the applications and use cases to leverage the customer investments, like Atypon and Data Harmony.
Summary:
Okay great – we know where the companies are going to get their work done, make deals, and to learn new things, but what about the individual? Where are they going? What are they now doing to learn and keep skills fresh?
The Internet has made many things possible that were not possible before. We can convene a meeting electronically in a very short time. We can have discussions over Skype or Webex or GoToMeeting. We can develop documents using collaborative wikis. We can have conference calls for people in many locations and several continents without leaving our desks. People have turned increasingly to webinars and web searching to find new things and answers. We follow blogs to read opinions and discussions to add to and enjoy.
If we go to a meeting, we are expecting something else. We want to find community. We want to build relationships, which can then be maintained on the Web once they are established. We want to have discussions. We want to help build, brainstorm, learn, and develop in a group setting. We want to make a deal, discuss the terms, and build trust, face to face. Teaching new skills, reading thought pieces, and announcements can now be done in a web-enabled environment.
Selling (Prostitution) of the speaking slots by the real vendors, those who put on the shows, has had a deleterious effect on the quality of the meetings. The costs have reached a tipping point where they no longer provide a good return on investment for attendee or exhibitor. It is no longer useful to have a big party for your users or to set up a user group meeting in conjunction with one of the big national meetings. But more than that, the challenge remains on how to engage the attendee. How can they be part of the meeting rather than a passive audience? How do you get a sense of community?
There are several budding online communities, which seem to be flourishing. The Taxonomy Community of Practice is one; the Taxonomy Division of SLA is another. The ones on LinkedIn and Facebook have not yet taken off. The rest are in user groups. Access Innovations’ Data Harmony User Group meeting will be held in Albuquerque February 7-9, 2012.
Come join the community!
Marjorie M.K. Hlava
President, Access Innovations
Back To School
If you are one of those people who love to learn something new all the time or maybe someone who enjoys going to school as an adult professional, this might be right up your alley. Free university classes – yes, I said free. The Stanford School of Engineering is running an experiment in distributed education that offers “Machine Learning” free and online to students worldwide during the fall of 2011.
Students will have access to lecture videos and lecture notes, receive regular feedback on progress, and receive answers to questions. When you successfully complete the class, you will also receive a statement of accomplishment. In addition, another course on “Databases” is being offered in the same format.
If you just want to learn more about this world of artificial intelligence, indexing and semantic language processing, this seems like a nice way to spend your spare time.
Melody K. Smith
Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.
National Libraries Adopt New Cataloging Code
The three U.S. national libraries — the Library of Congress (LC), the National Agricultural Library (NAL), and the National Library of Medicine (NLM) — have announced that they will adopt, with certain conditions, the Resource Description and Access (RDA) cataloging code.
The Library Journal brought this news to our attention in their article, “Cataloging Community Galvanized as U.S. National Libraries Move To Embrace RDA.” This was a bit of a surprise considering the fiscal and technological concerns involved with training and integration.
Though complete implementation will not occur before January 1, 2013, the Library of Congress will begin partial RDA cataloging across all subject areas in November. “We believe that the long-term benefits of adopting RDA will be worth the short-term anxieties and costs,” said representatives of the three national institutions. “We must begin now. Indefinite delay in implementation simply means a delay in our effective relationships with the broader information community.”
Sharing seems to be the term of the decade as combining resources among libraries is happening fairly consistently.
Melody K. Smith
Chicago Conference Focuses on Resource Sharing
“Resource Sharing in the Digital Age” is the theme for the upcoming Interlending & Document Supply Conference, sponsored by IFLA and ALA. The event will take place September 19-21, 2011 in Chicago.
The conference sessions are focused on interlibrary loan practices and services, plus other types of resource sharing including cooperative collection development, consortial borrowing, purchase on demand, and shared print repositories. Other topics include copyright and licensing, ethics, and global implications of resource sharing.
Information seekers have unprecedented access to information. Just because the information is digital doesn’t reduce the expectation of availability. Meeting the unprecedented demands of a new generation of researchers requires an equally unprecedented commitment to resource sharing across borders, time zones and oceans.
For more information about this conference, check out their site.
Melody K. Smith
Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.
Libraries Working Together for Best Practices
Two university libraries have joined together to start a new open access, peer-reviewed journal dedicated to library-led scholarly communication initiatives, online publishing and digital projects.
The new project is named the Journal of Librarianship and Scholarly Communication, and will provide a focused forum for library practitioners to share ideas, strategies, research and explorations of library-led initiatives. Topics will include institutional repository and digital collection management, library publishing/hosting services, and authors’ rights advocacy efforts. The journal will be co-edited by Marisa Ramirez (Cal Poly) and Isaac Gilman (Pacific University).
Libraries are known for resource sharing, collection management, cataloging/metadata, instruction and public services. The journal aims to lift up best practices in all those areas and provide a shared intellectual space.
We found this news on Pacific University Oregon’s site in the release titled, “Pacific University Library and Cal Poly San Luis Obispo Library form partnership.”
Melody K. Smith
The Power Of Survey Taxonomies To Skew The Results The Way You Want Them
April 18, 2011
Posted in Access Insights, Featured, reference, semantic, Standards, Term lists
I went to the doctor’s office this week and they asked if I would participate in a short Federal survey. I said sure.
“What is your nationality?”
“American,” I said.
“That is not an option,” said the lady.
“What are the options?” I asked.
“Hispanic, Asian, Asian Black, African, Central American, Chicano, Cuban, Hispanic Latino, Mexican, Native American, Native Hawaiian, South American, Spanish, White, White Hispanic Other, Unknown and Refused,” she said.
“I am Native American, White, Spanish and Mexican,” I said.
“You can only pick one,” she said.
“I am also Welsh, English, Scottish, German, Dutch, Irish and married to a Bohemian Mexican, Spanish French man. Put me down as Refused!”
She said “Most people are putting down Refused or Other.”
I figured that since I was at the doctor’s office, and many groups have known medical predispositions to diseases, that must be why they were asking. Medical predispositions of some sort (whether susceptibility to certain diseases or response to certain drugs) are at least a quite plausible reason. Of course, there’s still a problem with the lack of granularity, whether they’re doing research or predicting risk.
One example is the ingestion of fava beans, which may be fatal for some people of Mediterranean descent. I’ve heard anecdotes about a U.S. Army cook serving up meals with fava beans, and the infirmary subsequently dealing with an influx of very sick people.
I don’t have a reference for the latest version of the OMB Directive (still the 1997 one), but I came across the FDA’s “Guidance for Industry: Collection of Race and Ethnicity Data in Clinical Trials,” which says, in part:
“Differences in response to medical products have already been observed in racially and ethnically distinct subgroups of the U.S. population. …For example, in the United States, Whites are more likely than persons of Asian and African heritage to have abnormally low levels of an important enzyme (CYP2D6) that metabolizes drugs belonging to a variety of therapeutic areas, such as antidepressants, antipsychotics, and beta blockers (Xie 2001). Other studies have shown that Blacks respond poorly to several classes of antihypertensive agents (beta blockers and angiotensin converting enzyme (ACE) inhibitors) (Exner 2001 and Yancy 2001). …Clinical trials have demonstrated lower responses to interferon-alpha used in the treatment of hepatitis C among Blacks when compared with other racial subgroups.”
Ashkenazi Jews are known to be especially vulnerable to certain diseases, e.g. breast cancer. And from an American Association for Cancer Research journal: “62% of the Taiwanese colorectal tumor specimens analyzed exhibited Eps8 over expression.”
Those would indicate excellent reasons to do this survey. Nope! This classification cannot support that kind of analysis. The groups are incredibly unbalanced. Everyone Asian (Chinese, Korean, Indian, Malay, etc.) is in a single class – half the world’s population under a single classification. Then there is “African.” Africa is a huge continent. There are many phenotypes there and all are grouped into a single lump. “White” does not distinguish German, Scandinavian, English, or French, and most Spanish and Portuguese are Caucasian in origin as well.
More background. Many have tried to classify mankind. Bodin’s color classifications in the mid-1500s were descriptive, using neutral terms based on skin color such as “duskish colour, like roasted quinze, black, chestnut, and farish white.”
By the 1600s Bernier settled on four subgroups based on the four quarters of the globe and used Europeans, Far Easterners, Negroes (blacks), and Lapps.
In the 1800s Louis Agassiz made a case for a genre of scientific racism based on creationism and gained a wide following. We have Arthur de Gobineau to thank for the theory of the superior races and the Aryan race. He saw the intermingling of races – like French marrying Germans – as a degenerative process. Thomas Huxley and Charles Darwin were believers in monogenism (all humans descended from one evolutionary process). Huxley separated mankind into 9 types – four of them on the African continent, and three types of Mongoloid. Darwin argued that they were all one species and in the Descent of Man, chapter VII, notes the disagreement over whether man “should be classed as a single species or race, or as two (Virey), as three (Jacquinot), as four (Kant), five (Blumenbach), six (Buffon), seven (Hunter), eight (Agassiz), eleven (Pickering), fifteen (Bory St. Vincent), sixteen (Desmoulins), twenty-two (Morton), sixty (Crawfurd), or as sixty-three, according to Burke. This diversity of judgment does not prove that the races ought not to be ranked as species, but it shews that they graduate into each other, and that it is hardly possible to discover clear distinctive characters between them.”
In the later 19th and 20th centuries there were a lot of mental excursions into classifications based on intelligence, skull shape, etc. By the 1930s people had stopped trying to do these types of classifications, and the rise of the Nazis underscored how damaging such classifications can be, leading to ethnic cleansing by those who considered themselves superior.
In 1954 UNESCO condemned all approaches to classification by race saying that we should not make examples of the Caucasian, Negroid and Mongoloid races but rather talk about ethnic groups which share common cultural ties.
So what is the government doing? Recent news articles have heralded a 40% level of Hispanics in the US. Is that true? Do I have to be only one classification? How reliable are surveys where, of the 18 classifications available, 8 could be roughly grouped as Hispanic (what happened to Iberian?)? Aren’t the Spanish a combination of Moors and Celts? Why do we try to do this?
An interesting way to trace our thinking is to follow the US Census categories. In the 1790 Census the count was made on White Males, White Females, other free persons, and Slaves (all types). In 1940 Mexican was counted as white. The 2010 census allowed for an entire question on Hispanic origin including Argentinean, Salvadoran, etc., and an additional 15 categories for Race. Wikipedia itself has 35 entries for race and ethnicity. Seven of those are Hispanic, with an additional one for Non-Hispanic whites.
The American Anthropological Association made recommendations for the classifications for 2010 but they were not accepted by the Census Bureau. There is still no “American” for those of us who do not fit into one or even two classifications. Let’s see, 8 out of 18 classifications is … about 44 percent. The news says that the Hispanics are 40% of the population. I wonder what the Irish are. If we had a classification for Central Europeans, would they be a bigger part of the population?
This shows the power of the classification system in surveys. If you want to get a certain answer then you make that percentage of questions or answer options the percentage you hope for. How many Chinese Americans? They are under Asian. How many people from India? Look under Asian. Japanese, Filipino, Thai, Vietnamese, guess where to look. All are classed together. Want to know how many Arabs? Tough!
What if we were to let people put in their own classification? What would the answer be? The 1980 and 1990 censuses came close to that option. But they did not allow multiple posting. You could be either Black or White. If you said White/Black you were classed as White; if you put Black/White you were coded as Black. I do understand that the big mainframe computers of that age had fixed length fields and coded options with limited sort options. But those days are long past. Now we handle variable length fields of text and multiple subfields, and we can sort and aggregate information in many ways.
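The difference between the old single-field coding and a multi-valued field can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the Census Bureau's actual processing; the function names and the slash-separated answer format are assumptions made for the example.

```python
def code_fixed_field(answer: str) -> str:
    """Keep only the first listed value, as a single coded field would."""
    return answer.split("/")[0].strip()


def code_multi_valued(answer: str) -> list[str]:
    """Keep every listed value, in the order the respondent gave them."""
    return [part.strip() for part in answer.split("/") if part.strip()]


# Hypothetical write-in answers, for illustration only.
responses = ["White/Black", "Black/White", "White"]

fixed = [code_fixed_field(r) for r in responses]
multi = [code_multi_valued(r) for r in responses]

print(fixed)  # ['White', 'Black', 'White']
print(multi)  # [['White', 'Black'], ['Black', 'White'], ['White']]
```

With the single field, two people who gave the same two answers in a different order end up in different classes. With the multi-valued field, nothing the respondent wrote is thrown away.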
What would I do?
- Work from the data. People got really annoyed with the census. Some refused to answer at all. The options were not ones they felt comfortable with. I would let people put in their own assessment. That would give a realistic assessment of what people prefer to think of themselves as. What if we decide to collect the information to see how diverse we really are? Actually we do not have this data, but we should try to collect it. It is not too early to decide on what to collect for the 2020 Census. Perhaps by then we can see …20/20.
- Ensure that the balance of the surveys is truly reflective of the data group. Do not bias it by the questions asked. If about 44% of the answer options fall into a single grouping, is that really a fair survey? I would not allow surveys that try to cram everyone into a single class (multiple broader terms should be allowed). I would allow as many listings (race or ethnicity) as people want to take the time to put in. We are a melting pot of a country. We used to be proud of it. Now we try to segment and separate, which drives wedges and divides.
- Provide associations. If you let people do their own classification, allowing free associations, then the results would provide linkages the creator of a survey instrument could not foresee. The richness of related terms in the thesaurus or links in semantic web are a bonus in richness of expression.
- Make a hierarchy. Are all those classifications equal or are some subdivisions of others? Could someone choose a higher level because they are Cuban and Latino? Might some want to be grouped as Hispanic? It does not have to be a single flat list. Let people decide how discretely they want to be classified. That would tell us a lot about the nation. This step takes a lot of care; it is where an unscrupulous or careless group would have power to really slant the survey by the way it organizes the hierarchy.
- Does it matter what the group calls itself? There are shorthand ways of describing every ethnic group and race. Can we allow them to use those names and translate them into officialdom? I think that would make the results a better source of information about the groups themselves. Someone could decide on the preferred term usage, but not at the data collection level. That would interfere with the real data collection.
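The steps above can be sketched as a small tally in Python: respondents give as many answers as they like, an equivalence table maps the names people use to a preferred term after collection, and a hierarchy with multiple broader terms rolls the results up. Everything here is hypothetical and only for illustration: the `EQUIVALENT` and `BROADER` tables, the sample answers, and the function names are assumptions, not any real survey's vocabulary.

```python
from collections import Counter

# Hypothetical equivalence table: names as entered -> preferred term.
EQUIVALENT = {
    "chicano": "Chicano",
    "chicana": "Chicano",
    "latino": "Latino",
    "latina": "Latino",
    "cuban": "Cuban",
}

# Hypothetical hierarchy: preferred term -> broader terms (more than one allowed).
BROADER = {
    "Cuban": ["Hispanic", "Caribbean"],
    "Chicano": ["Hispanic"],
    "Latino": ["Hispanic"],
    "Welsh": ["European"],
}


def normalize(term: str) -> str:
    """Map an entered name to its preferred term; unknown names are kept as entered."""
    cleaned = term.strip()
    return EQUIVALENT.get(cleaned.lower(), cleaned)


def tally(responses: list[list[str]]) -> tuple[Counter, Counter]:
    """Count respondents per preferred term and per broader term."""
    exact: Counter = Counter()
    rolled_up: Counter = Counter()
    for answers in responses:
        terms = {normalize(a) for a in answers}
        exact.update(terms)
        broader = set()
        for term in terms:
            broader.update(BROADER.get(term, []))
        rolled_up.update(broader)  # each respondent counts once per broader term
    return exact, rolled_up


# Hypothetical self-reported answers, one list per respondent.
responses = [["Cuban", "Latina"], ["chicano"], ["Welsh", "Cuban"], ["Bohemian"]]
exact, rolled_up = tally(responses)
print(exact["Cuban"], rolled_up["Hispanic"])  # 2 3
```

The preferred-term decision lives in the tables, not in the data collection form, so the raw answers stay intact and the tables can be revised later. It also makes the risk in the hierarchy step visible: whoever edits `BROADER` decides what the rolled-up totals look like.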
Summary
If the census and other surveys were built on controlled vocabulary principles, then there would be Associative, Equivalence, and Hierarchical options. Working from the data instead of imposing a preferred order on the subjects would give a significantly enhanced data set. In this digital age, we should be able to do much better. We are no longer bound by old style mainframe computing or tallying all results by hand. Let’s catch the census and other surveys up to current standard practice in information science.
Marjorie M.K. Hlava
President, Access Innovations
New Grant Funds Digital Research Center
April 18, 2011
Posted in News, reference, Uncategorized
April 18, 2011 – Emory University is soon to have a coffeehouse-like environment in which students can work in cross-discipline collaboration and researchers can draw on the library’s many experts on all things library. This will differ from other digital research centers because of the cross-discipline collaboration potential.
Per The Associated Press’s article, “Emory University in Atlanta gets grant to open digital research center for faculty, students,” Emory University has received a $695,000 grant to create a digital center where students and faculty in the humanities can mix and mingle on research projects. The two-year grant comes from the Andrew W. Mellon Foundation.
It will be interesting to see the results that come out of this new project and workspace over the following years. We already know the difference environment can make in learning potential.
Melody K. Smith
Founder of Lucid to be Keynote Speaker at Lucene Revolution
April 8, 2011
Posted in News, reference, Uncategorized
April 8, 2011 – Marc Krellenstein, founder of Lucid Imagination, is set to address “The Once and Future History of Enterprise Search and Open Source,” at Lucene Revolution 2011.
Other speakers scheduled for this conference, dedicated to open source search, include Tyler Tate, Head of User Experience at TwigKit; Joshua Tuberville, Search Architect at eHarmony; and Floyd Morgan, Principal Software Engineer at Intuit.
This wide range of speakers includes the foremost experts on open source search technology. The two-day conference agenda is packed with technical sessions, developer content, user case studies, panels, and networking opportunities. For a look at the full agenda and to register, visit their website.
Melody K. Smith
Wine Application Also a Great Teaching Tool
April 4, 2011 – Here is another piece of software that uses semantic technology to make our dining experience just a little bit better. I don’t know about you, but this gets my attention.
We found this interesting piece of news on Teatro International in their article, “How to use web semantic language to choose wine.” Web scientist and Rensselaer Polytechnic Institute Tetherless World Research Constellation Professor Deborah McGuinness is no newcomer to developing applications for wine enthusiasts. She has been doing it since before “the Internet” was an everyday term. McGuinness is now considered one of the world’s experts in Web ontology languages.
The most recent wine application is an exceptional tool for teaching future web scientists about ontologies.
“The wine agent came about because I had to demonstrate the new technology that I was developing,” McGuinness said. “I had sophisticated applications that used cutting-edge artificial intelligence technology in domains, such as telecommunications equipment, that were difficult for anyone other than well-trained engineers to understand.”
The online sommelier is given basic background knowledge about wine and food. The semantic technologies beneath the application then encode that knowledge and apply reasoning to search and share that information. The application can also be used to make personal wine suggestions, manage a personal wine cellar, and link with the inventory of the local liquor store or wine shop.
In my humble opinion, this is a good use of semantic technology. But then again, I might be just a little subjective.
Melody K. Smith
Taxonomy Information Sources
February 21, 2011
Posted in Access Insights, Featured, reference, Taxonomy
I am often asked, “Where can I find out more about software and taxonomies to reuse?” There are a number of attempts to organize taxonomy resources on the web. Some are more up to date than others.
Software information
The old standby was maintained by Leonard Will in the UK. He builds thesauri for a living and has a great section on software offerings at this URL – http://www.willpowerinfo.co.uk/thessoft.htm.
This site contains information about which thesaurus-building resources were available, where to get them, and a bit about how they worked. The Willpower site is still very useful. I have not found this information elsewhere online; I guess one of us should compile it. Anyone want to work on it together?
In the past the Wills were responsive; whenever I sent a change, they made it. The Willpower site says it was last modified 2010-08-10, but here are a few updates.
- The Data Harmony Thesaurus Master information is from 2002. So go to www.dataharmony.com instead for that information.
- The current American Society for Indexing thesaurus software listing.
- The Thesaurus Management Software page was last updated in 2003 and has some interesting leads to follow in spite of a lot of broken links.
- http://www.wandinc.com/prod_tools.aspx lists only the companies that they have a business arrangement with.
Reusable taxonomy information
A second venerable source is the Taxonomy Warehouse, built by Synaptica and later purchased by Factiva, part of Dow Jones. I noticed that it has lately been heavily populated by two other sources that sell taxonomies through Taxonomy Warehouse. It carries approximately 35 taxonomies from the Wand Company and 72 or so from Cengage Learning, plus other resources. Both of these companies have taken large taxonomies and split them into smaller branches for sale and re-purposing by customers.
Other sources are not as well known. CALL (the Center for Army Lessons Learned) lists a few government thesauri. Tesauro lists about 100 Spanish-language thesauri. TaxoBank lists about 200 so far and encourages people to submit their works either for sale or for free distribution. It is a social site that enables the community to add and find out about controlled term lists of all kinds (taxonomies, thesauri, authority files, etc.) and to share or purchase them directly.
Of course, there are the standalone word resources, the dictionaries and synonym finders themselves, such as http://thesaurus.com/, which puts different faces on the same content under different URLs. It is one site that many URLs can reach; for example, http://dictionary.reference.com/ appears in different colors, but the same site is reachable using either URL.
Roget’s thesaurus is now online in many places. The words are the same but the presentation and searchability are different.
- http://www.yourdictionary.com is a notable source.
- Yahoo! Education References allows you to search both by category and alphabetically.
- Bartleby.com – Here also you can search by category or alphabetically.
- Project Gutenberg – From this site you can download the 1991 release of Roget’s Thesaurus for free.
- The ARTFL Project - The 1911 version of Roget’s Thesaurus is searchable fun.
Here is a site to AVOID. http://dictionary-and-thesaurus.net/?s=kp06p1625aqnhbm1o8h415mm60 has a really annoying pop-up asking for your cell phone number before you can proceed to look anything up.
No list would be complete without mention of WordNet from Princeton University, which is widely used as a lexical dictionary in semantic systems. It shows the gradient of words and meanings in a continuum.
What all of this really shows is that for people who spend their time organizing information, their tools are not well organized or discussed. I imagine it is because vendors like me want to be fair and balanced; they are the most knowledgeable and have seen the most options. They do, however, need to make a living and may have biases, even if only subconscious ones, toward their own services. I do not think there are any writers or listers who are not vested in some way. University faculty are looking for research funding. However, Marcia Zeng of Kent State University creates excellent and unbiased resource lists. Taxonomy Community of Practice is run by Seth Earley, who uses it effectively to promote his services and software. Meeting organizers like Information Today, sponsor of the Taxonomy BootCamp meetings, put many consultants on the podium.
Perhaps the association-sponsored groups, like the SLA Taxonomy Division, composed of over 200 taxonomy practitioners, and the American Society for Indexing Taxonomy SIG, are the most balanced opportunities around. The ASI one is convened by Heather Hedden, who wrote a book on taxonomies and builds taxonomies. ASIS&T (the American Society for Information Science and Technology) has lately sponsored a series of webinars on taxonomies; those webinars have been very well received. First Joseph Busch of Project Performance Corporation (they build taxonomies) and later Marjorie Hlava of Access Innovations (they build and implement taxonomies) were instrumental in developing the series.
Is there a trend here? You bet!
The takeaway? If you want the real story, ask several people who do this work for a living. They work where the rubber meets the road.
Marjorie M.K. Hlava
President, Access Innovations
Related articles: