New Contributor

Mar 16, 2014 at 10:32 PM
I am interested in contributing. Where do I start?

Simi Talkar
Mar 17, 2014 at 5:47 PM
Hi Simi! And welcome to the party!!

So I talked with Jeremiah and based on that conversation I had a suggestion on where you could help out.

To get started at your own pace, I would suggest helping us with a really messy problem we have. If you head over here you can see all the repositories that we need to build to make Thali happen. As you can see, it's a really long list. And it's about to get longer because Couchbase just released their Java bindings and that will add another two repositories.

Right now we do all of our builds manually and that is an error prone process guaranteed to produce problems. What we need is a fully automated build. But this isn't trivial because we need to do things like start VMs to run Android, run tests, deploy to Maven, etc.

So what we are thinking is to use Jenkins to create a continuous build environment. Ideally it would 'trip' whenever we updated our main repositories and do a full build and test run and if everything passes then update our maven releases and our apk/executable jar releases. That way our process would be documented, repeatable and fully automated.

Since we don't have the time or resources to set this up right now we aren't blocked on it. So if you decide to work on this you could do so at your own pace and not worry that anyone is sitting there waiting for you.

I realize this isn't the most fun project in the world but if your goal is to get immersed in the latest open source infrastructure (which is the basis for doing any code development these days) then this is a great way to do it!

If this does sound like something you want to take on then I would make the following recommendations.
  1. Go read our wiki. Yes, there is A LOT of information there. NO, we don't expect you to understand or master all of it. The purpose of reading the wiki is to wrap your head around what we are doing at a high level and get a 'sense' for what we are up to.
  2. Go read your favorite resource on Git. This is the foundation of everything we do. Yes, we have a page on Git but honestly that is intended more as notes and not as an introduction. There is a free book on Git, the first 3 chapters cover the basics. But there are a lot of intro guides to Git and I encourage you to look around to find one that feels readable to you. Git is really the core of everything so you need to understand it. To be fair, it's not easy. I'm sorry. Git is insanely powerful but it is also equally complex. But if you don't understand the basics of how it works you will end up making a mess. I wish there was a less painful way to get into Git, but I haven't found it yet.
  3. Go read up on Maven. You really and truly don't need to understand too much about Maven. Just the difference between 'local' Maven and 'remote' Maven. We just use Maven for serving up libraries (e.g. AARs and JARs). We don't use it for anything else.
  4. Go read up on Gradle. There is a lot of Gradle Documentation available for free. Gradle has, I must admit, been the bane of my existence. I once calculated that 25% of the time I've spent on this project has been dealing with Gradle issues. The problem is that Gradle is fast moving and the Android Gradle plugin in particular is buggy as hell. But you don't need to be a Gradle deity to achieve your goals here. You just need to understand the basics of how it works.
  5. Go install IntelliJ Community Edition. This is our core IDE. BTW, I think this automatically installs the Android SDK but if it doesn't you'll need to find and install that too. For a complete list of software we are using see here.
  6. Go and try to actually follow our build instructions (and remember to sync the depots we link to from GitHub and CodePlex, you will have learned how when you read about Git)
  7. Now it's time to run all of our tests. We'll talk more about how to set that up and run it later.
  8. Then it's time to read up on Jenkins and understand how to program it.
  9. Then it's time to write the Jenkins scripts to create our build and test environment and run actual builds. We'll need both a local and remote set up.
  10. Then finally we will set up a Bitnami image in our Azure account to run Jenkins and will install the scripts there
See? Easy!

O.k. now, if you are anything like me, you are looking at the above list in slack-jawed disbelief wondering if I'm either crazy or a sadist. We can argue crazy, but I'm not a sadist. Rather (and I encourage you to show this email to Jeremiah) the above are actually the tools that a modern open source software developer uses every day.

Now, you have to be wondering - can anyone really keep all that in their head? Of course not! The trick is to learn how to skim not dive deep. You don't need to become a Git guru, a Maven maven, a Jenkins expert, etc. What you do need to do is understand what your goals are and figure out how to get in, find exactly what you need and get out, fast.

So here are my suggestions for how to tackle the above.
  1. Skim. Go through each item and find the Wikipedia page for the technology. Your goal is just to understand at a very high level (no details) what the heck the technology actually does. So if someone says "Git" you will immediately understand that they are talking about a version control system that runs in a distributed manner. You don't know the details yet. You don't care. You just want to have a fuzzy picture of the land.
  2. Install IntelliJ Community Edition. This should be easy and will set you up for the next step.
  3. Really do read the first 3 chapters (or equivalent) of the Git book I linked to above. You do need to understand 'how' Git works (even if you don't know all the details). Without that knowledge you will constantly run into issues.
  4. Now clone Thali's depot on codeplex (if you read the first chapters in the previous step the previous sentence will now make sense to you) and focus in specifically on the Universal Utilities project. What's nice about that project is that it's vanilla Java (so none of the Android complexity comes into it) and it doesn't have tests you can run directly (I'll explain later why). Now just build that via Gradle. Even a very thin introduction to Gradle will teach you enough to do the build. It's a simple command.
  5. Now install a local Maven server (easy, I promise) and this time do a local install via Gradle into local Maven of the JAR file. If you read a basic introduction to Maven you will know more or less what this means. You will need to go search out our build instructions that tell you how to do this (linked above). Expect problems since I'm sure our instructions suck. So send in issues! Remember, every problem you have, everyone else is going to have. So when we update our instructions you are helping everyone.
  6. Now you are ready to start learning about Jenkins. Again, skim. And install a local Jenkins server (I've never done this so I have no clue how hard that is) and try to get Jenkins to do the build. This literally means getting Jenkins to run a single Gradle command.
This is a single, simple, pass of the general pattern we are going to build. It has all the major parts but at a vastly reduced level of complexity. As you go through these steps make sure to send mails on this list with issues. If you make it to the end of the simple list above then we can continue more iterations that incrementally add more complexity until we finally complete the full list at the start of this mail.

The point is not to try and climb Everest the first day. Rather first we'll climb a tiny hill, then a slightly bigger hill and so on iteratively until we finally get everything we need and climb the biggest hill of all.

If after reading this you still want to play then let us know and start on the simple list and send in questions, issues, problems, etc. We'll try our best to help!

Mar 17, 2014 at 7:17 PM
A lot of deep wisdom and solid practical advice is packed into that post, Yaron! Plus, it's a model for how to welcome someone to a project.
Mar 17, 2014 at 8:38 PM
Wow! I second Jon's comments.
Mar 18, 2014 at 3:13 AM
Thank you for the detailed work up of where I can start (and proceed). I am so glad you are letting me do this at my own pace and I am also extremely grateful that you are helping me along.

Thanks so much,

Simi Talkar

Mar 18, 2014 at 5:36 PM
My pleasure! Just make sure to ask questions!
Mar 19, 2014 at 12:52 AM
I will. Familiarizing myself with GIT and downloading IntelliJ, Android SDK...

Simi Talkar

Mar 20, 2014 at 5:22 PM
I have been playing around with IntelliJ and have made a simple Android app - displaying Hello World on the screen.

1) I am able to preview it when viewing the main.xml file
2) I have configured the Run to run this application on an AVD - a Nexus

When I run the application I realize it is extremely slow and a connection is made to the Emulator and then I get :

Waiting for device.
Target device: AVD_for_Galaxy_Nexus [emulator-5554]
Uploading file
local path: C:\Users\Simi\IdeaProjects\HellloAndroid\out\production\HellloAndroid\HellloAndroid.apk
remote path: /data/local/tmp/course.examples.HellloAndroid
I/O Error: EOF

Is this because the process is timing out?

Mar 20, 2014 at 5:41 PM
Hi Simi,

Life is too short to wait for the built-in emulator. It is incredibly slow.

Here's one that works much better.

Let us know if those instructions are adequate, or if not what's needed to improve them.
Mar 20, 2014 at 6:31 PM
Thanks, I'll try it out.


Mar 21, 2014 at 4:44 PM
BTW, Simi, it occurred to me that there is an even simpler task you should probably start off with that will clarify a bunch of issues that have to be clarified for the ultimate task to happen. So this new task puts you firmly on the road to the grander goal.

Scenario - After a long day of hacking across the 9 depots we use, the Thali developer wants to know which depots are in sync with origin and which are not.

To solve this problem you have to do a few things:
  1. You have to decide what the structure should be for the depots. E.g. each depot is a directory. Are all those directories required to be siblings in the file system? Probably should be. Do you decide that all the directories inside a specific parent directory should be checked? Probably, that way you don't have to hard code any names. Or, put more clearly, you can require that there be a single directory whose contents are nothing but directories, each of those child directories containing a separate depot (remember, per above, we have 9 of them).
  2. You have to figure out the right Git commands to compare the state of the local depot with its origin.
  3. What language should you write this program in? Certainly the traditional choice for something this simple would be bash. But if you are going to use bash then SKIM!!! Bash does everything and you can spend a long time learning it. I would instead just do targeted deep dives. Honestly the commands you need probably already exist, including the ability to enumerate the directories, apply the git command, check the response and display the result. But don't feel bad if you want to use Java or C#. The only problem there is that calling out to the command line from those environments sucks. That's why bash is so perfect for this. Command line commands are what bash exists for.
Anyway just an idea and a program I desperately wish I had yesterday!
Mar 21, 2014 at 8:48 PM
I am getting more familiar with git and its commands so I will work on your suggestions...

Mar 26, 2014 at 6:24 PM

I am trying to reconcile my understanding about GIT and the questions you posed (see email below):

You have to decide what the structure should be for the depos. Are all those directories required to be siblings in the file system? Probably should be.

Do you decide that all the directories inside a specific parent directory should be checked? Probably, that way you don't have to hard code any names.

My understanding :

A Depo – is a remote site working on the project that has previously cloned from the main repository.

To keep things in sync with the origin, do you decide that all the directories inside a specific parent directory should be checked? Probably, that way you don't have to hard code any names.

My Understanding:

Are you thinking of an Integration manager that runs a check of what differences exist between the main repository and the various remotes? The git log command with various options shows the difference between the various commits.

The Git documentation talks about people performing Git fetch into their own local area, then merging their work and then pushing (or requesting a pull) their committed and merged work back to the main repository.

Typical steps for Integration workflow:

  1. The project maintainer pushes to their public repository.
  2. A contributor clones that repository and makes changes.
  3. The contributor pushes to their own public copy.
  4. The contributor sends the maintainer an e-mail asking them to pull changes.
  5. The maintainer adds the contributor’s repo as a remote and merges locally.
  6. The maintainer pushes merged changes to the main repository.

If there are people collaborating on the project who do not have Write access to the main repository:

1) They can add each other’s public repository as remote

2) Fetch branches they want to merge into their local repository

3) Send a request to pull all the committed and merged work

One thing I am not clear about is, by the directories and sub-directories- do you mean checking for sync in the various branches in the depos versus the origin they cloned from?

As you can see I have quite a few questions about the setup of the main repository and the relation to the depos and between the depos.

Simi Talkar

Mar 26, 2014 at 7:44 PM
These are awesome questions!!! They show you are following the thread. It also shows that I haven't provided enough information. It's the assumptions that always get you (and reading the above I see I didn't make my assumptions clear). But you are on the right track. So let's get you further along on that track.

First, some terminology:

Depot - A Git repository. Nothing is implied about where this is. It could be on your machine. It could be in CodePlex. It could be anywhere.

Cloned Depot - A Git clone of an origin depot.

Origin Depot - A depot that someone made a clone of.

So, for example, on my machine I have a local cloned depot of the origin depot yaronyg/couchbase-lite-java-native.

With that as background we can now get to the problem.

As a Thali Core Developer your machine has cloned depots of our 9 or so origin depots. You might fix a bug in your clone of our Ektorp depot or update something in your clone of our Couchbase depot. Or make a change in your clone of our Thali depot. The day is coming to an end. What you want to know is - which of my 9 cloned depots have changes that I have not submitted to their origin depot? Similarly, which of the origin depots have changes that are not in my cloned depot?

The question we want to answer is - which, if any, of my cloned depots have changes that are either:
A) A change in the cloned depot that is not in the origin depot
B) A change in the origin depot that is not in the cloned depot

So what we want is a script that we can run that will say "Here is the list of depots that are not synch'd with their origin".

So this begs a few questions:
1) How do you know which depots to check? My suggestion was that we assume that all cloned depots are in the same directory. So when you run the script it assumes that the child directories of the directory it is run in are depots. That way the script doesn't need to know anything about the depots other than they are in the directory the script was executed from. So literally the script just enumerates the child directories of the directory it was launched from and walks through each of those children applying step 2.
2) How do we check each directory? There are standard Git commands for this. We probably need to do a 'git fetch origin' to see what's going on at the origin. Then a git status. Now when I run those commands I do so visually. E.g. I run them at the command line and look at the output with my eyes. For this script to be useful it has to get the output in some programmatic format so it can decide 'does situation A or B exist above'? See here for one example of how this could be done.
3) For each directory for which situation A or B applies the script would output a line saying "Depot [insert directory name here] has unsynch'd changes".

At that point the developer can decide what they want to do about it. Maybe they know "Oh yeah, I made some changes to Ektorp but they aren't ready to commit yet so I won't worry about it" or "Oh boy, all that work in Thali is ready to go, thanks for reminding me Mr. Script so I can go commit them".

Now there is (of course) one more complication. Branches.

In Thali we use the 'master' branch as the 'official' place to stick 'things'. But if someone knows they are working on something that isn't ready to go into master then they will create their own local branch with some random name. For right now, I don't want you to worry about that. You should assume that whatever branch a cloned depot is checked out in is the one you should check against the origin. But this does mean that you should check the branch names against each other. In other words, if the cloned depot is in a branch named 'foo' then you should compare origin/foo. My guess is that the Git commands will handle a lot of this automatically (git status certainly does). But you'll have to check.

Now, to be clear, this whole script will probably be something on the order of 10 lines of code. Probably less. This isn't anything fancy. It's just the first baby step on a long path of baby steps to our ultimate goal.

I hope this made things more clear rather than less so. But ask more questions and we'll find out!
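
The three steps above (enumerate the child directories, fetch and compare against origin, print one line per unsynch'd depot) could be sketched in bash roughly as follows. This is only a sketch under assumptions: the function names describe_sync and report_unsynced are made up for illustration, every child directory that contains a .git entry is treated as a cloned depot, and the checked-out branch is assumed to track a branch on origin. It uses git rev-list --left-right --count to get the two commit counts in a form a script can read.

```shell
#!/usr/bin/env bash
# Sketch only: report cloned depots that are out of sync with their origin.
# Assumption: each child directory of the parent directory is a cloned depot
# whose checked-out branch tracks a branch on origin.

# Turn the two commit counts into a label (hypothetical helper name).
#   $1 = commits only in the clone, $2 = commits only in the origin
describe_sync() {
  local ahead=$1 behind=$2
  if [ "$ahead" -gt 0 ] && [ "$behind" -gt 0 ]; then
    echo "diverged"      # situations A and B at the same time
  elif [ "$ahead" -gt 0 ]; then
    echo "ahead"         # situation A: clone has commits the origin lacks
  elif [ "$behind" -gt 0 ]; then
    echo "behind"        # situation B: origin has commits the clone lacks
  else
    echo "in sync"
  fi
}

# Walk the child directories of $1 (default: the current directory).
report_unsynced() {
  local parent=${1:-$PWD} dir counts ahead behind state dirty
  for dir in "$parent"/*/; do
    [ -e "${dir}.git" ] || continue                  # skip non-depots
    git -C "$dir" fetch --quiet origin || continue
    # Prints "<only in HEAD><TAB><only in upstream>" for the current branch.
    counts=$(git -C "$dir" rev-list --left-right --count 'HEAD...@{upstream}' 2>/dev/null) || continue
    ahead=$(printf '%s\n' "$counts" | cut -f1)
    behind=$(printf '%s\n' "$counts" | cut -f2)
    state=$(describe_sync "$ahead" "$behind")
    dirty=$(git -C "$dir" status --porcelain)        # uncommitted work, if any
    if [ "$state" != "in sync" ] || [ -n "$dirty" ]; then
      echo "Depot $(basename "$dir") has unsynch'd changes (commits: $state${dirty:+; uncommitted work})"
    fi
  done
}
```

To try it, source the file and run report_unsynced from the directory that holds the cloned depots (or pass that directory as the first argument). Depots with no tracking branch are silently skipped here, which may or may not be what you want.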


Mar 26, 2014 at 8:22 PM
I'm just now getting around to building some of those cloned depots, specifically the Java libraries. says:

"Follow to configure the environment"

I don't have a account yet. Is it still possible to operate locally, such that libraries mentioned in are built in %HOME%\.m2\repository, and then referenced from there?
Mar 26, 2014 at 9:08 PM
Related (possibly): what is the relationship between %HOME%/.gradle/caches and %HOME%/.m2/repository?
Mar 27, 2014 at 1:32 AM
While the questions Jon asks are relevant to Simi's project in the long term they are not relevant to her immediate task. I'm pointing this out just so things don't get confused.

Jon, you really asked two questions.

I would rephrase the first question as - how do I build to mavenLocal? See here for the answer to this question.

The second question, about the relationship of %HOME%/.gradle/caches and %HOME%/.m2/repository is that they are theoretically completely separate.

What happens is IF you are doing a gradle build and IF you have included the maven plugin and IF your repositories include mavenLocal (preferably as the first entry since gradle resolves dependencies in the order the repositories are listed) and IF you have a dependency THEN gradle will look in %HOME%/.m2/repository to find the dependency.

Now in general when gradle is looking for a file it will store it in %HOME%/.gradle/caches. Lots of stuff gets shoved there. But I've generally noticed that anytime I publish an update to mavenLocal I find that gradle grabs it. My guess is that since I always build to mavenLocal via gradle that gradle updates its cache via the maven plugin. But I don't know that for a fact and I've never tried to edit mavenLocal manually (e.g. outside of gradle) to see what would happen.
Apr 2, 2014 at 5:09 PM
So I ended up solving the immediate script problem partially because it was driving me nuts. All this script does is walk the directories in the directory it is called from and check if there is any uncommitted work. I use this before I do an update to artifactory to make sure I don't make a mess of things. All this script does is tell me which of my depots have uncommitted changes. It is not complete but it's a start.

# Print every child directory that has uncommitted changes.
for i in "$(pwd)"/*; do
  if [ -d "$i" ]; then
    cd "$i"
    if [ -n "$(git status --porcelain)" ]; then
      echo "$i"
    fi
  fi
done
Apr 2, 2014 at 11:31 PM
I also built a gradle script that builds Everything. See here for the script and here for instructions on using it. This will be a good starting point for our work with Jenkins.

What is unfortunate about this script (and has nothing to do with Jenkins) is that it doesn't check first to see if we have any uncommitted changes. It also depends on for things like artifactory_local which is messy. This means that you have to know to manually edit your and make sure to check if you have any uncommitted changes. That is an error prone process that is going to fail. But for now I am stopping on this work because there is something about how GradleBuild works that makes it ignore system properties I set programmatically in the build.gradle file in Production and running Git commands from Gradle is not immediately straight forward. There is an Exec task but remember that (at least on my Windows box) that requires accessing the Git bash shell.

There are work arounds for all of these problems. But now that the core build is automated on my box I'm going to move on to getting the WebView bridge running so we can hit our demo date!
Apr 3, 2014 at 12:26 AM
Thank you for the bash script. I was still trying to figure out how to use Git commands in bash. Let me see if I can help with the Gradle issues.

I had set aside a script I found on Stack Overflow:

To check which local branches are out of Sync with the remote.

Simi Talkar

Apr 3, 2014 at 2:21 AM
Note that the bash script will only run in Windows if you are using Git for Windows and running their bash shell. If you are using Linux or OS/X then it should run as is, I believe both their default shells support bash just fine.

But I like the link you have because that is a problem whose solution would save me a bunch of time. Right now I have to run "git fetch upstream" five times to check if any of the couchbase depots have been updated. This doesn't sound like a big deal and it isn't but the result is that I miss things. I also do it less often than I should because it's a pain. Having a script like the above that would just return the names of the directories whose upstream have unmerged changes would be really useful.
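
A script of that kind might look roughly like the sketch below. It rests on assumptions: the function name list_behind_upstream is made up for illustration, the depots are the child directories of a single parent directory, the remote is called upstream, and the branch to compare against is master. Depots without an upstream remote are skipped.

```shell
#!/usr/bin/env bash
# Sketch only: list depots whose "upstream" remote has commits that are
# not yet merged into the local checkout.
# Assumptions: depots are child directories of $1 (default: current directory),
# remote name defaults to "upstream", branch defaults to "master".

list_behind_upstream() {
  local parent=${1:-$PWD} remote=${2:-upstream} branch=${3:-master} dir missing
  for dir in "$parent"/*/; do
    [ -e "${dir}.git" ] || continue
    # Skip depots that have no remote with this name.
    git -C "$dir" remote | grep -qx "$remote" || continue
    git -C "$dir" fetch --quiet "$remote" || continue
    # Count commits reachable from the remote branch but not from HEAD.
    missing=$(git -C "$dir" rev-list --count "HEAD..$remote/$branch" 2>/dev/null) || continue
    if [ "$missing" -gt 0 ]; then
      echo "$(basename "$dir"): $missing unmerged commit(s) on $remote/$branch"
    fi
  done
  return 0
}
```

Sourcing the file and running list_behind_upstream from the parent directory replaces the manual round of "git fetch upstream" calls with a single command. The remote and branch names can be overridden with the second and third arguments.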

Also solving the limitations I identified above in the Gradle script would also be awesome.

Either item adds serious value so pick whichever you are more comfortable with.

For what it's worth Gradle has been the bane of my existence on this project. I find the docs to be painfully useless and the mailing lists and Stack Overflow content to be shallow at best. Every time I have to do anything with Gradle I end up frustrated and wishing I could use something else. So you might want to take on the bash script challenge first. You are likely to get more help from the Internet there.


P.S. The reason we are using Gradle is because Google mandated it as part of their latest Android releases.
Apr 3, 2014 at 11:20 AM
Edited Apr 3, 2014 at 1:03 PM
Suppose we had a Python script that did some of the needed work. According to

we could do
task runPython(type:Exec) {
   workingDir 'path_to_script'

   commandLine 'python', ''
}
The question then, I suppose, is how do you get Gradle to recognize and react to the output of such a script? (Without sacrificing a ram and decorating your forehead with its blood.)

I know I am going to regret this, but I found this example:
  classes = dir('build/classes')
  createTask('resources', dependsOn: classes) {
    // do something
  }

  createTask('otherResources', dependsOn: classes) {
    if (classes.dir.isDirectory()) {
        println 'The class directory exists. I can operate'
    }
    // do something
  }
If the 'resources' task were a Python script that signaled success or failure by creating or deleting the 'classes' directory, then the 'otherResources' task could act accordingly. Presumably you could use a file as the same kind of signal. Python could iterate over your projects, run git on each, and at the end create or delete that special directory or file. And personally I would much rather use Python (or Ruby, or Perl) to parse the output of git.

But hang on. Others have been down this road before, what do they do?
"Programmatically" means never ever rely on porcelain commands.

Always rely on plumbing commands.

See also "Checking for a dirty index or untracked files with Git" for alternatives (like git status --porcelain)

You can take inspiration from the new "require_clean_work_tree function" which is written as we speak ;) (early October 2010)
require_clean_work_tree () {
    # Update the index
    git update-index -q --ignore-submodules --refresh
    err=0

    # Disallow unstaged changes in the working tree
    if ! git diff-files --quiet --ignore-submodules --
    then
        echo >&2 "cannot $1: you have unstaged changes."
        git diff-files --name-status -r --ignore-submodules -- >&2
        err=1
    fi

    # Disallow uncommitted changes in the index
    if ! git diff-index --cached --quiet HEAD --ignore-submodules --
    then
        echo >&2 "cannot $1: your index contains uncommitted changes."
        git diff-index --cached --name-status -r --ignore-submodules HEAD -- >&2
        err=1
    fi

    if [ $err = 1 ]
    then
        echo >&2 "Please commit or stash them."
        exit 1
    fi
}
Again you could wrap these git "plumbing" commands in your script language of preference. (Though in practice since the hard part is understanding the plumbing commands, if the best set of reusable components is done for bash, then bash it is.)
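
As a rough illustration of that wrapping, the sketch below applies the same plumbing commands to every depot under a parent directory and relies on their exit codes instead of parsing text. The function names depot_state and report_dirty_depots are made up for illustration. Note two limitations of this approach: these plumbing commands do not report untracked files, and the diff-index check assumes the depot already has at least one commit.

```shell
#!/usr/bin/env bash
# Sketch only: classify a depot using plumbing commands and their exit codes.
# depot_state prints one word and returns 0 only when the depot is clean.

depot_state() {
  local dir=$1
  if [ ! -e "$dir/.git" ]; then
    echo "not-a-depot"
    return 2
  fi
  # Refresh the index so the checks below do not report stale stat info.
  git -C "$dir" update-index -q --ignore-submodules --refresh
  # Non-zero exit means the working tree differs from the index.
  if ! git -C "$dir" diff-files --quiet --ignore-submodules --; then
    echo "unstaged"
    return 1
  fi
  # Non-zero exit means the index differs from HEAD.
  if ! git -C "$dir" diff-index --cached --quiet HEAD --ignore-submodules --; then
    echo "staged"
    return 1
  fi
  echo "clean"
}

# Print one line for each child directory of $1 that has pending changes.
report_dirty_depots() {
  local parent=${1:-$PWD} dir state
  for dir in "$parent"/*/; do
    state=$(depot_state "${dir%/}")
    case $state in
      unstaged|staged) echo "$(basename "$dir"): $state changes" ;;
    esac
  done
}
```

Because depot_state signals its result through the exit code as well as the printed word, a caller such as a Gradle Exec task or a Jenkins job could react to it without parsing any output.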

And again you could wrap Gradle around these scripts.

Probably have to sacrifice more than a few rams though.
Apr 3, 2014 at 5:12 PM
As I said, I think this is all solvable. I'm not sure why we would bring Python into this (other than it is a much better language than bash shell scripts) but beyond that I basically agree. But since our countdown to the demo is getting pretty low I'm going to ignore this for now.
Apr 3, 2014 at 5:25 PM
Agreed, just a placeholder.