AIDC Code of Conduct

Our Mission

At AIDC, we're building a platform that empowers data engineers to monetize their datasets, fueling the innovation economy for data in AI while maintaining the highest standards of ethics and integrity. This Code of Conduct guides our community to ensure a safe, respectful, and productive environment for everyone. We expect everyone to do the right thing, but sometimes it's helpful to be explicit about what that is.

Post what is yours

You must have rights to what you post. We help you create license agreements, but if your content includes copyrighted material, trademarks, patents, or other intellectual property, you must cite the releases that cover that content. To make this clear, you can mark your dataset as containing this kind of content as you work through the posting process.

Similarly, if you are posting content that was previously released under a public license and you are now offering it for purchase, you must make sure that the public license allows you to modify the dataset and use it for commercial purposes. Many licenses do allow such use as long as you cite the source material. We will have blog posts and webinars discussing these details.

If you have accumulated data through web crawling or other acquisition processes, we ask that you respect the policies posted in each website's robots.txt file and honor any do-not-crawl notices. We respect the open web and public data sites, but we also respect those who do not wish their web data to be harvested. That is one reason we created this site. Respect the privacy wishes of those who post data on the internet, and comply with copyright and use notices.
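
As an illustration of what respecting robots.txt can look like in practice, here is a minimal sketch using Python's standard urllib.robotparser module. The function name is_crawl_allowed, the user agent "example-crawler", and the example.com rules are all hypothetical and exist only for this example. This is not a description of how AIDC checks datasets.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the lines of a robots.txt file that has already been fetched.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, made up for this example.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.com/public/page.html",
        "https://example.com/private/data.csv",
    ):
        allowed = is_crawl_allowed(EXAMPLE_ROBOTS_TXT, "example-crawler", url)
        print(url, "allowed" if allowed else "disallowed")
```

A crawler would run a check like this before each request and skip any URL that is disallowed. Keep in mind that robots.txt is only one signal: copyright and use notices on a site still apply even when crawling is permitted.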

Keep the data clean and above board

Datasets that contain pornography or hate speech, or that facilitate criminal activity or gambling, are prohibited on this site. We scan automatically for this content. Datasets that contain it will be rejected, and you will be suspended and potentially banned from the site. When we find issues, we will investigate and communicate with you.

We also scan datasets to make sure that they are in the format you describe and that they are completely filled with the data you have described. We don’t want to see datasets that are half full. Be responsible and deliver datasets that are valuable and that Purchasers find useful. If we don’t catch a problem, others who try to use the datasets will, and you won’t last long on the platform. We want the site to have diverse but high-quality datasets, so please load only complete, useful datasets.
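
Before you upload, a quick self-check for completeness can help. The sketch below is only an illustration and assumes a CSV dataset with a header row. The function name fill_rate and the sample data are hypothetical, and this is not the scan the platform runs. It reports the share of cells that are non-empty.

```python
import csv
import io


def fill_rate(csv_text: str) -> float:
    """Return the fraction of non-empty cells in the data rows of a CSV with a header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return 0.0
    total = 0
    filled = 0
    for row in reader:
        # Count every expected column, so short rows count as missing cells.
        for i in range(len(header)):
            total += 1
            if i < len(row) and row[i].strip():
                filled += 1
    return filled / total if total else 0.0


# Hypothetical sample: the second data row is missing its label.
SAMPLE_CSV = "id,label\n1,cat\n2,\n3,dog\n"

if __name__ == "__main__":
    print(f"fill rate: {fill_rate(SAMPLE_CSV):.2%}")
```

A low fill rate is a sign that the dataset may not match its description and is worth fixing before you post it.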

We take data privacy seriously

Generally speaking, you want to make sure your datasets do not contain any personally identifiable information (PII) such as names, phone numbers, or addresses, and we check for that. When you load datasets, you can indicate whether they contain PII and whether they are GDPR/CCPA compliant. Depending on the use case and how the dataset is licensed, you may want to anonymize the dataset to make it easier to sell for AI training purposes. For large unstructured datasets, such as a domain-specific dataset discussing a particular topic, an anonymized version could be quite valuable.
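
To give a sense of what a first anonymization pass might look like, here is a minimal sketch that redacts email addresses and simple US-style phone numbers from text. The patterns, the placeholder tokens, and the function name redact_pii are assumptions made for this example. Regular expressions like these catch only the most obvious cases (they will miss names, street addresses, and many phone formats), so treat this as a starting point rather than a compliance tool. It does not represent the checks AIDC performs.

```python
import re

# Deliberately simple patterns; real PII detection needs much broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and simple phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(redact_pii("Contact jane@example.com or 555-123-4567."))
```

For datasets subject to GDPR, CCPA, or HIPAA, a pass like this would normally be combined with manual review or dedicated tooling.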

Similarly, for HIPAA data, there are specific approaches to anonymization that can be applied. We allow you to identify these approaches, describe the details to prospective buyers, and execute the appropriate agreements as needed. The goal is to make the datasets available with the right privacy controls in place.

Make sure that your dataset is what you say it is and that everyone is treating the privacy of those involved properly. Our focus here is on AI training.

No malicious or intentionally misleading data

All datasets must be free from malware, malicious content, and security threats. Datasets cannot contain any intentionally misleading data, misinformation, deepfakes, or other deceptive content intended to inhibit the effective training of an AI system.

Communication

Often a Purchaser will have a question about your dataset. We will connect them to you, and we expect a direct and professional response. We also expect you to come back to the platform to complete the transaction, because that is where both parties get a fully executed contract, secure payment, and dataset transfer. We are focused on creating a fluid market, so keeping communications flowing is important for both sides.

Enforcement

We take this Code of Conduct seriously. Violations may result in warnings, temporary suspension, or permanent removal from the platform, depending on the severity of the infraction. We reserve the right to remove any dataset or user account that we deem to be in violation of these guidelines.  This is consistent with the contract you signed to come onto the site.

If you have any concerns about another Member or Purchaser, or about any dataset that you have seen posted, please contact us here.

Updates

This Code of Conduct will be updated on an ongoing basis. Updates will be posted here.

Our Commitment

We are committed to fostering a vibrant and ethical community of data engineers who are shaping the future of AI. By adhering to this Code of Conduct, we can ensure that our platform remains a trusted resource for high-quality datasets and a catalyst for responsible innovation.  

Let's build the future of AI, together, the right way.