[review this in October 2025] Add: data section to guide #110
I hope you don't mind me jumping in on the draft @NickleDave, I was linked here from a PR. Is there something you feel is needed to get this data section live? It looks overall really solid as is, and if it were merged it can still be enhanced, by making new sprint tickets.
then you would likely put it inside your source code
so that it will be included in the sdist and wheel.
If the data is meant only for tests,
and you have a separate test directory (as we suggest)
incomplete sentence.
you probably want to be aware that such tools exist if you are reading this section.
Such tools could be particularly important if your package focuses mainly on providing access to datasets.
Within science, tools have been developed to provide distributed access to datasets. These tools
general
incomplete sentence

## How to access your data

Last but definitely not least, it's important to understand how you *and* your users
incomplete sentence
https://github.com/fatiando/pooch
code snippet example of using pooch

### For tests
I get the impression this is much less useful for scientific programming, but do we want to mention that the data could instead be fully generated, with something like faker or hypothesis, so that no data files need to be checked in at all? At least for tests.
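To make the idea concrete, here is a minimal sketch of what generated test data could look like. It deliberately uses only the standard library `random` module as a stand-in for what faker or hypothesis would provide, and the names `make_samples` and `test_mean_is_within_range` are hypothetical, not taken from the guide:

```python
import random
import statistics


def make_samples(n, seed=0):
    """Generate n synthetic (timestamp, value) samples, reproducibly."""
    # A local generator keeps the global random state untouched,
    # and a fixed seed makes every test run see the same data.
    rng = random.Random(seed)
    return [(i * 0.5, rng.gauss(10.0, 2.0)) for i in range(n)]


def test_mean_is_within_range():
    # No data file is read here: the input is built on the fly.
    samples = make_samples(100, seed=42)
    values = [value for _, value in samples]
    assert min(values) <= statistics.mean(values) <= max(values)
```

Seeding the generator keeps failures reproducible, which is the main thing a checked-in data file would otherwise give you.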
@ucodery, this has been open for so long! Essentially, David wrote a draft years ago, and we were talking about it in Slack. At some point, I think I suggested to David that we put it online and let folks add to it. I think some parts of it are still in bullet format.
We could merge, OR we could also add to the content by suggesting inline edits to sections that are in bullet format.
I'm super open to an approach, or if you want me to decide a strategy, I would lean into - let's all work on sections and get it to a point where we can merge it. Just a thought!!
I am okay with either approach. But I think unless we actively bring up the conversation again on Slack, or make this part of a documentation sprint, there won't be much, if any, added content.
I think the best thing is to get this PR pushed as soon as we can while still looking professional and not including any incorrect information. We can still discuss or sprint on it after a push too. My thinking is that having some information up online is likely to attract more contributions than a long-standing PR. People online are more likely to come to us with corrections or additional parts if they find it in our guide, and probably aren't idly looking through our GitHub.
Cool. I just updated the branch, so our preview builds. I'll have a look at it, and then let's plan on a sprint of some kind. MAYBE we start with a HackMD sprint to get it over the finish line to "good enough" to merge? If I pull this down, create a new branch, and re-push, I can link it to our HackMD for easy editing on that branch. Let me give this a try and we can go from there!
This is still very much a work in progress! Just adding what I have so far from the last two writing sprints