Community Leadership Summit Wiki

Building Community around Open Source vs. Open Data


host: Michael Burnstein

notes: Andrew Davis

Similarities and differences between open data communities and open source communities

  • open data communities have more internal debate over licensing
  • the field is new and there has not been enough litigation for businesses to gauge risk
  • Copyright restrictions are weightier for data vs software
    • in the US a database of facts cannot itself be copyrighted, because facts are not copyrightable and aggregation alone isn't "creative"
    • in Europe aggregation is considered to be a creative task (this has hindered open data startups from starting in Europe)
    • in the US you can extract the facts from a dataset and not be liable for copyright infringement; the copyright situation in Europe is more tenuous
    • open data communities are difficult to start because rules are not the same all over the world
    • restrictions require that those joining or using data startups be known personally, to ensure that they are not litigious
    • OpenStreetMap had to abandon a CC viral license model

How do we incentivize contribution to Open Data?

  • people will put their data into the public domain to contribute to a specific cause
  • You don't have to appeal to "what's right" or "social good"
    • bringing business to open data requires showing business benefits
    • Opening research data
      • peer review in the public space is one way of convincing academia
      • getting rid of database access fees is something academics are highly interested in
      • scandals over falsified data have brought public awareness of open data to the Netherlands (Stapel (sp?))

"don't try to evangelize non-geeks about the benefit of open data because they don't care"

Read vs. Write Access in Open Data (access to data vs. contributing to data)

  • How do you verify public contribution?

    • community verification (10 people agree, so it's probably right)
    • trusted users
    • community users can crowdsource coverage of data verification
    • don't allow public access
    • multiple repositories can be used to verify each other
    • closed source code can restrict duplication and ensure that quality demands are met
    • humans must be involved; you can't automate all verification
  • What happens when bots target a dataset for corruption?
    • time thresholds are often used to prevent bot corruption
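
A few of the verification ideas above (community agreement, trusted users, time thresholds) can be combined in a small sketch. Everything here is an assumption made for illustration: the quorum of 10 comes from the "10 people agree" note, but the 7-day account-age threshold, the class names, and the rule that one trusted user is enough are hypothetical, not something the session settled on.

```python
from dataclasses import dataclass, field

QUORUM = 10               # from the notes: "10 people agree, so it's probably right"
MIN_ACCOUNT_AGE_DAYS = 7  # assumed time threshold; the notes give no number


@dataclass
class User:
    name: str
    account_age_days: int
    trusted: bool = False


@dataclass
class ProposedEdit:
    value: str
    confirmed_by: set = field(default_factory=set)
    trusted_confirmation: bool = False

    def confirm(self, user: User) -> bool:
        """Record a confirmation; reject accounts younger than the threshold."""
        if user.account_age_days < MIN_ACCOUNT_AGE_DAYS:
            return False  # brand-new accounts (often bots) do not count
        self.confirmed_by.add(user.name)  # a set, so repeat votes count once
        if user.trusted:
            self.trusted_confirmation = True
        return True

    def is_verified(self) -> bool:
        """Verified by one trusted user or by a quorum of ordinary users."""
        return self.trusted_confirmation or len(self.confirmed_by) >= QUORUM
```

A check like this only filters the easy cases; as noted above, humans still have to review what automation cannot decide.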

Public vs. Private Data

  • you must track the source of all data

  • medical data has certain fields that can never be shared
  • lawyer wiki is completely closed to allow for open discussion
  • is it enough to close or denormalize some data to the public to maintain privacy?
  • allow users to express their level of consent with clear wording
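
As a rough illustration of the points above, a record can carry its source and the contributor's consent level, and a public view can withhold fields that must never be shared. The field names and consent levels below are hypothetical. Whether withholding fields is enough to protect privacy is exactly the open question raised in the notes, so this is a sketch of the mechanism and not a privacy guarantee.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical examples of fields that can never be shared.
NEVER_SHARED = {"patient_name", "date_of_birth"}


@dataclass(frozen=True)
class Record:
    source: str   # where the data came from (always tracked)
    consent: str  # hypothetical levels: "public", "research_only", "private"
    fields: dict


def public_view(record: Record) -> Optional[dict]:
    """Return the publishable part of a record, or None without public consent."""
    if record.consent != "public":
        return None
    visible = {k: v for k, v in record.fields.items() if k not in NEVER_SHARED}
    # The source stays attached so every published value can be traced back.
    return {"source": record.source, **visible}
```

For example, a record with fields `{"patient_name": "A", "diagnosis": "flu"}` and consent `"public"` would be published with only its source and `diagnosis`.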


  • TriMet routing is built upon OpenStreetMap
    • TriMet is responsible for route and timetable accuracy
    • TriMet is not responsible for map accuracy
    • TriMet verifies the OpenStreetMap data every night and submits corrections back to the community
  • First Monday
  • AOL anonymized search query data was quickly de-anonymized
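
The nightly check described for TriMet, and the earlier idea of repositories verifying each other, comes down to comparing two copies of the same data and reporting where they disagree. The sketch below assumes both datasets are simple mappings from a segment ID to a street name; that layout and the function name are hypothetical and do not describe how TriMet or OpenStreetMap actually store or exchange data.

```python
def find_discrepancies(reference: dict, community: dict) -> dict:
    """Map each disagreeing key to (reference_value, community_value).

    Keys missing from the community copy are reported with None as the
    community value. Keys present only in the community copy are ignored.
    """
    corrections = {}
    for key, ref_value in reference.items():
        community_value = community.get(key)
        if community_value != ref_value:
            corrections[key] = (ref_value, community_value)
    return corrections


# Hypothetical data: segment IDs mapped to street names.
surveyed = {"seg-1": "SW 5th Ave", "seg-2": "NE Broadway"}
community_map = {"seg-1": "SW 5th Ave", "seg-2": "NE Brodway"}
proposed = find_discrepancies(surveyed, community_map)
```

The resulting mapping is a list of proposed corrections, which a person can review before anything is submitted back to the community.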
