Vetting Dependencies: Ensuring Software Maintainability

Introduction

What is a dependency? In the software world, a dependency could mean many different things, but for this article, when I say dependency, I’m talking about a software package you install using package managers such as npm or pip.

In many software ecosystems, dependencies exist in almost every project. In the Python world, pip install makes it very easy to add a dependency to your project and get functionality for free. Python’s package ecosystem contains plenty of high-quality, well-maintained dependencies, but it also includes a lot of abandoned and lower-quality ones. Adding new dependencies to a project affects its maintainability. For example, suppose a new version of Python is released and you want to upgrade your project, but a dependency that you added a few months ago doesn’t support the new version. Now you must choose between dropping the dependency and holding off on the upgrade until the dependency supports the new version.

I run into this scenario frequently in my professional work. Developers, including myself, too readily add dependencies without considering the long-term consequences to a project. This article walks through how to properly vet a dependency before including it in your project. I speak specifically about Python, but these lessons apply to any language or software ecosystem.

Dependency Checklist

  • Popularity
    • How many stars on GitHub?
    • How many downloads per month?
  • Update Frequency
    • What is the release cadence?
    • How quickly are issues resolved and are they tracked in the release notes?
  • Financial Backing
    • Does the project have funding?
    • Is the dependency backed by a legal entity?
  • Manpower
    • How many contributors does the project have?
    • Does the project have a governance structure or a core team?
  • Licensing
    • Is the license open source friendly?
    • Does the license allow forks?
  • Complexity
    • Could you implement the dependency yourself?
    • Could you maintain a fork of the dependency if it’s abandoned?
  • Code Quality
    • Does it have adequate test coverage and do the tests pass?
    • How many dependencies does it have and what is the quality of those dependencies?
  • Documentation
    • Does the project have comprehensive documentation?
    • Is the documentation kept up to date with the code?

Checklist Explanation

Popularity

Popularity indicates whether or not the dependency is worth adopting. Downloads per month and GitHub stars aren’t necessarily the best indicators of popularity, but these metrics are easy to find and good enough. Including a dependency that lacks wide adoption and an active following is risky and should be done with care.
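As a rough sketch, both popularity questions can be answered from public JSON endpoints. The code below assumes the payload shapes of the pypistats.org "recent" endpoint and the GitHub repository endpoint; the helper names and the thresholds in is_popular are my own illustrative assumptions, not established cutoffs.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str) -> dict:
    """Download and decode a JSON document (requires network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def monthly_downloads(pypistats_payload: dict) -> int:
    """Read last month's download count from a pypistats 'recent' payload."""
    return pypistats_payload["data"]["last_month"]


def github_stars(repo_payload: dict) -> int:
    """Read the star count from a GitHub repository payload."""
    return repo_payload["stargazers_count"]


def is_popular(downloads: int, stars: int,
               min_downloads: int = 100_000, min_stars: int = 1_000) -> bool:
    """Apply arbitrary, hypothetical thresholds; tune them to your own risk tolerance."""
    return downloads >= min_downloads and stars >= min_stars


# Example usage (not run here because it needs network access):
# stats = fetch_json("https://pypistats.org/api/packages/django/recent")
# repo = fetch_json("https://api.github.com/repos/django/django")
# print(is_popular(monthly_downloads(stats), github_stars(repo)))
```

Keeping the parsing separate from the network call makes the checks easy to run against saved responses.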

Update Frequency

Update frequency shows us whether or not a dependency is actively maintained. A dependency does not necessarily need to release updates frequently to indicate quality. Some dependencies don’t require frequent releases because they’re mature and stable, but look for quick fixes to high-priority bugs as a sign of quality.
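One way to put a number on release cadence is to look at the gaps between release dates. The sketch below assumes the shape of the "releases" mapping in PyPI’s JSON API (https://pypi.org/pypi/<package>/json), where each version maps to a list of uploaded files carrying an upload_time field. The function names and the sample data are hypothetical.

```python
from datetime import datetime
from statistics import median
from typing import Optional


def release_dates(releases: dict) -> list:
    """Return the earliest upload time of each release, oldest first."""
    dates = []
    for files in releases.values():
        if files:  # some versions have no uploaded files
            dates.append(min(datetime.fromisoformat(f["upload_time"]) for f in files))
    return sorted(dates)


def median_gap_days(dates: list) -> Optional[float]:
    """Median number of days between consecutive releases, or None if unknown."""
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return median(gaps) if gaps else None


# Hypothetical sample in the assumed shape of the "releases" mapping.
sample = {
    "1.0": [{"upload_time": "2020-01-01T00:00:00"}],
    "1.1": [{"upload_time": "2020-01-31T00:00:00"}],
    "2.0": [{"upload_time": "2020-05-10T00:00:00"}],
    "2.1.dev0": [],
}
print(median_gap_days(release_dates(sample)))
```

A large median gap is not a verdict on its own; as noted above, a mature package may simply need few releases.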

Financial Backing

In the open source world, financial backing is a nice to have, not a must have. Some dependencies grow so popular that they receive financing. Financing tends to indicate quality, but the lack of financing is not a deal breaker.

Manpower

Frequently, you’ll find a dependency maintained by a single core contributor. If that core contributor decides to abandon the dependency or lacks the time to maintain it, you now face a tough choice of dropping the dependency, keeping the dependency without maintenance, or maintaining it yourself. In some cases, another person forks and performs maintenance, but you shouldn’t count on that.
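A quick way to spot a single-maintainer project is to measure how much of the work comes from the top contributor. This is a minimal sketch that assumes you already have per-contributor commit counts, for example from the contributions field of GitHub’s repository contributors endpoint. The function name and the sample numbers are hypothetical.

```python
def top_contributor_share(contributions: list) -> float:
    """Fraction of all recorded contributions made by the most active contributor."""
    total = sum(contributions)
    if total == 0:
        raise ValueError("no contributions recorded")
    return max(contributions) / total


# Hypothetical commit counts, one entry per contributor.
counts = [900, 50, 30, 20]
print(f"{top_contributor_share(counts):.0%} of commits come from one person")
```

A share close to 100% suggests the project depends on one person, whatever the raw contributor count says.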

Licensing

Often, licensing gets overlooked, but an open source approved license is important. If the dependency does not have an open source approved license, then people might not be able to fork it and maintain it. Additionally, being able to view the code itself gives you insight into whether or not to implement it yourself.
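For packages that are already installed, the license question can be checked from the package metadata. The sketch below looks for trove classifiers that start with "License :: OSI Approved". It assumes the package declares its license through classifiers, which not every package does, so an empty result means "check manually" rather than "not open source". The helper names are my own.

```python
from importlib.metadata import metadata

OSI_PREFIX = "License :: OSI Approved"


def osi_license_classifiers(classifiers: list) -> list:
    """Keep only the classifiers that declare an OSI-approved license."""
    return [c for c in classifiers if c.startswith(OSI_PREFIX)]


def installed_classifiers(package: str) -> list:
    """Read the classifiers of an installed package (raises if it is not installed)."""
    return metadata(package).get_all("Classifier") or []


# Illustrative classifier list in the format packages publish.
sample = [
    "Development Status :: 5 - Production/Stable",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
]
print(osi_license_classifiers(sample))
```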

Complexity

I see developers include new dependencies that don’t add a lot of functionality or new dependencies that could be replicated easily. Dependencies are burdens and should only be added if they truly save the developer time and effort.

Code Quality

Code quality can be measured in numerous different ways, but for the purpose of this checklist, we care about the dependency working as intended and not getting ourselves into dependency hell. Testing indicates whether the dependency works as intended, and using the checklist to vet sub-dependencies helps prevent dependency hell.
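To count a package’s direct dependencies, you can read the requirement strings from its metadata. The sketch below filters out requirements that only apply when an optional extra is requested. The parsing is deliberately simple and the package names in the sample are hypothetical; a real tool would more likely parse requirements with the third-party packaging library.

```python
import re
from importlib.metadata import requires

_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def hard_dependencies(requirements) -> list:
    """Return names of requirements that are not gated behind an extra."""
    names = []
    for requirement in requirements or []:
        spec, _, marker = requirement.partition(";")
        if "extra" in marker:
            continue  # optional dependency, installed only on request
        match = _NAME.match(spec.strip())
        if match:
            names.append(match.group(0))
    return names


def installed_hard_dependencies(package: str) -> list:
    """Look up an installed package's requirements and filter them."""
    return hard_dependencies(requires(package))


# Hypothetical requirement strings in the format importlib.metadata.requires returns.
sample = [
    "examplelib>=1.2",
    "otherlib",
    'speedups>=2.0; extra == "fast"',
    'backport; python_version < "3.8"',
]
print(hard_dependencies(sample))
```

This only covers direct dependencies; vetting sub-dependencies means repeating the lookup for each name it returns.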

Documentation

Quality documentation can be more important than the code itself. Well written documentation makes adopting a new dependency much easier and indicates the dependency’s quality. As a dependency grows, the documentation tends to drift out of sync with the code itself. Ensuring the documentation stays up to date with the code helps indicate the quality of the dependency.

Example Dependencies

Let’s use actual Python dependencies to walk through an evaluation using the checklist. We’ll use a simple rating scheme of positive, neutral, and negative with our checklist.
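If you want to keep evaluations in a comparable form, the rating scheme maps naturally onto a small data structure. This is only a sketch; the Rating enum, the summarize helper, and the partial example evaluation are hypothetical names and data, not part of any library.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


def summarize(ratings: dict) -> Counter:
    """Count how many checklist questions received each rating."""
    return Counter(rating.value for rating in ratings.values())


# Hypothetical, partial evaluation keyed by checklist question.
example = {
    "How many stars on GitHub?": Rating.POSITIVE,
    "What is the release cadence?": Rating.NEUTRAL,
    "Does the project have funding?": Rating.NEGATIVE,
    "Is the license open source friendly?": Rating.POSITIVE,
}

summary = summarize(example)
print(summary)
```

The tally is a summary, not a score: a single negative on licensing or manpower can outweigh several positives.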

Django

Django is a Python web framework that takes a batteries-included approach to building a database-backed HTTP web service.

  • How many stars on GitHub?
    • Rating: positive

Django has over 50,000 stars as of this writing, which is a lot. Are GitHub stars the best measure of popularity? No, but they provide a good enough and easy to find measure.

  • How many downloads per month?
    • Rating: positive

Django had over 5,000,000 downloads in the month prior to this writing. This shows me a lot of people use this dependency, and I can be confident adopting it.

  • What is the release cadence?
    • Rating: positive

Django has a formal release process which follows a loose form of semantic versioning and provides long-term support releases. Django releases bug fixes regularly and provides alpha and beta releases of upcoming versions.

  • How quickly are issues resolved and are they tracked in the release notes?
    • Rating: positive

Django has its own issue tracker and, generally, high priority bugs get resolved quickly. Django also has its own security policy to ensure sensitive, security related bugs won’t get disclosed publicly before a fix.

  • Does the project have funding?
    • Rating: positive

Django receives funding from many different sources including companies such as JetBrains, Sentry, and Rackspace.

  • Is the dependency backed by a legal entity?
    • Rating: positive

The Django Software Foundation, a non-profit open source software foundation, supports Django and has a board of directors to oversee it.

  • How many contributors does the project have?
    • Rating: positive

Django has close to 2,000 contributors as of this writing.

  • Does the project have a governance structure or a core team?
    • Rating: positive

Django has a fellow program which pays contractors to work on the Django project. Additionally, a board of directors and a core team manage Django.

  • Is the license open source friendly?
    • Rating: positive

Django’s source code is licensed under the BSD 3-Clause license, which is an open source approved license.

  • Does the license allow forks?
    • Rating: positive

Yes, the BSD 3-Clause license allows forks.

  • Could you implement the dependency yourself?
    • Rating: positive

Django is a large and complex project with plenty of history behind it. I rated this positive because I don’t think I could implement Django. The positive rating signifies that using the dependency would save me a lot of time and effort compared to implementing the functionality on my own.

  • Could you maintain a fork of the dependency if it’s abandoned?
    • Rating: negative

Again, Django is a large, complex project. Even though Django is unlikely to be abandoned, I rated this negatively because I doubt that I could maintain a fork of it if it were. Maintaining a fork would be time consuming, and that would hurt the maintainability of my project.

  • Does it have adequate test coverage and do the tests pass?
    • Rating: positive

Django has a comprehensive test suite and practices continuous integration.

  • How many dependencies does it have and what is the quality of those dependencies?
    • Rating: positive

As of this writing, Django 3.0 only has three hard dependencies with none of them having sub-dependencies. If you vet Django’s dependencies using this checklist, they should all pass.

  • Does the project have comprehensive documentation?
    • Rating: positive

Django has high quality documentation and keeps documentation for past versions. The documentation covers all aspects of the code and does deep dives into various topics.

  • Is the documentation kept up to date with the code?
    • Rating: positive

Django’s documentation keeps up to date with the code and includes notes around what version specific changes occur.

Verdict

Django is an extremely popular, well-maintained web framework for Python and easily passes the checklist. If Django fits into your project, including it is an easy decision.

Pendulum

Pendulum is a drop-in replacement for the Python datetime class. It provides a cleaner and easier to use API for common datetime operations.

  • How many stars on GitHub?
    • Rating: positive

Pendulum has over 4,000 stars on GitHub as of this writing, and I believe that’s good enough for a positive rating.

  • How many downloads per month?
    • Rating: positive

In the month prior to this writing, Pendulum had over 2,000,000 downloads. To me, that demonstrates the package’s popularity and usage.

  • What is the release cadence?
    • Rating: neutral

Pendulum doesn’t have a formal release cadence, and large time gaps exist between some releases. However, this could indicate a stable, working package that doesn’t require many updates beyond the occasional bug fix.

  • How quickly are issues resolved and are they tracked in the release notes?
    • Rating: neutral

Looking at the issue tracker shows that issues do get closed relatively fast if they’re high priority bugs, and the package author responds promptly. However, open issues linger, and the backlog of pull requests and issues continues to grow.

  • Does the project have funding?
    • Rating: negative

Pendulum does not appear to be funded.

  • Is the dependency backed by a legal entity?
    • Rating: negative

Pendulum does not appear to be backed by a legal entity.

  • How many contributors does the project have?
    • Rating: negative

Pendulum has 64 contributors as of this writing. However, the overwhelming majority of the code is contributed by the author.

  • Does the project have a governance structure or a core team?
    • Rating: negative

Pendulum does not appear to have a core team. The project seems to be run by a single core contributor. However, the author appears to be calling for more maintainers.

  • Is the license open source friendly?
    • Rating: positive

Pendulum’s source code is licensed under the MIT License which is an open source approved license.

  • Does the license allow forks?
    • Rating: positive

Yes, the MIT License allows forks.

  • Could you implement the dependency yourself?
    • Rating: neutral

Datetimes with time zones can become complex, and Pendulum deals a lot with time zones. I believe I could replicate parts of Pendulum, but it would not be trivial.

  • Could you maintain a fork of the dependency if it’s abandoned?
    • Rating: neutral

I could maintain a fork, but once again, it would not be trivial.

  • Does it have adequate test coverage and do the tests pass?
    • Rating: positive

Yes, Pendulum has good test coverage and a comprehensive test suite.

  • How many dependencies does it have and what is the quality of those dependencies?
    • Rating: positive

As of this writing, Pendulum has two dependencies with one sub-dependency. If you vet these, they should all pass.

  • Does the project have comprehensive documentation?
    • Rating: positive

Yes, the project has extensive documentation.

  • Is the documentation kept up to date with the code?
    • Rating: positive

Yes, generally speaking when the code changes, the documentation updates.

Verdict

Pendulum is one of the more popular datetime packages within the Python community. It’s not a no-brainer to include in your project, but if it fits your use case and enhances your project, I believe it’s worth including. The biggest risk I see in including Pendulum is that the author is the sole core contributor. If the author decides to abandon the library, you would have to choose between dropping it, forking it, or continuing without maintenance.

Final Thoughts

I hope this article gives you pause before adding another dependency to your software project. I think that too frequently developers reach for a new dependency to fill a functionality gap when the developer could replicate the functionality. Dependencies put a long term maintenance burden on a project and should only be included when absolutely necessary. Hopefully, this article provides the tools to ensure the quality of your dependencies. If you have any questions or comments, feel free to reach out to me.

Steven Pate
Founder

Senior Software Engineer with a focus on Python and Linux based solutions