Version control for Excel financial models

This product requirement document is one I prepared as part of the application process for a FinTech organization. It played a pivotal role in securing my position with the company.

Objectives

Develop an Excel-centric version control system with the primary objective of fostering collaboration across diverse industries, enabling collective contributions to both new and existing business software solutions through knowledge sharing. This system streamlines optimization using a no-code approach and plays a crucial role in precisely monitoring individual contributions and managing comprehensive file versions. The platform offers additional functionalities, such as visualizing differences in file versions, tracking changes, and facilitating Excel model reviews.

Value Preposition Canvas

The Value Proposition Canvas places the customer at the center of product development, which helps to create customer-centric solution by understanding customer needs, problems, and gains. It ensures a strong fit between the product and the market, and helps aligning the solution to the problem.

Success Metrics

Goals

Metrics

A cloud-based platform which let Excel users group to host the Excel workbook (business logics/solutions).

Allows Excel users group to operate with the deep file versioning controls via git, without the advanced understanding on the low-level git versioning commands (no-code way) as a general software developer does.

Allow clients to meet their governance and compliance requirements for managing their Excel models in an automated and pragmatic way

Allow Excel users group to build the collaborative / community environment, and let them to exchange stories, experience and share solution output with appropriate access control.

Assumptions & Dependencies

Incorporate xltrail as the backend versioning system?

xltrail (https://www.xltrail.com/) is a Git(versioning) extension that understands the VBA code inside the workbook.  In this product specification, product manager assumed the benefits of integrating xltrail as the core control in this product outweights the possible risks in the future.

PROs

CONs

Implement file versioning control for Excel workbooks without relying on xltrail or any other third-party libraries?

Issues

Functional requirements

Possible solutions

1

Serialise Excel Workbook data into text format so the changes can be tracked in the versioning control.

(1) Backend serialisation automatically when user commit a new version.

(2) Support necessary cell properties (Content, formula, defined names, cell Notes, power queries?, vba modules?)

(3) Split cell contents for each sheet and Excel macros into 2 separate tracked contents.

"As a contributor to the workbook, I aim to integrate my progressions into the file versioning system, enabling me to track every version I've established throughout my work progress, and utilize content comparison functionality."

Method (1): Use Python 

Options: python-oletools, Openpyxl, Pandas, Xlwings 

(Open source, versatile, code faster)

Method (2): C# Options:   Microsoft.Office.Interop.Excel

(OOP, organised, perform faster) 

2

Present file difference to users.

(1) visualise the gif diff result (Difference between 2 cells) into the Excel-like layout which is easy/clear for non-developer to read

(2) UI presents all differences , e.g. cell content, formula, defined names etc)

"As a workbook contributor, I desire a web UI layout similar to that of a spreadsheet within the solution, allowing me to conveniently view my work content without the need to download it to my desktop."

"As a workbook contributor, I aim for the Spreadsheet-like UI layout to display and highlight content differences between two file versioning branches. This feature will enable me to review the updates I've made in my copy compared to the current roll-out version."

Steps:

(1) Complete a Excel-like layout (Single sheet, web-based).  Deserialise the text structure into the workbook layout.

(2) Run git diff to reads the difference between the source and the updated file.

(3) Combing the 2 versions into one text structure and highlight the difference based on text  syntax.

(4) Present the git diff in the UI layout (green for new/updated content, red for removed content)

3

Resolve merge conflicts

(1) User friendly and self-explained UI.

(2) Provide easy options to non-technical users to resolve the conflict(s), 

"As a workbook contributor, I require a solution that offers straightforward conflict resolution methods, allowing me to quickly address conflicts, whether through automated or manual means, without the need for an in-depth understanding of merge conflicts."

Steps:

(1) Use the Excel-like layout to highlight the cell which has the merge conflict.  When mouse point hover to the cell then show “Mine” version and “Theirs” version.

(2) In the cell content, give option (Right-click) to user to resolve the conflict by 

(3) Create a “Merge” button to confirm all conflicts are settled, and then merge file.

Prior the “Merge” process:

Option: “Compile” the workbook and make sure no error, e.g.#VALUE!, #REF!, #NULL!.


Option: re-run the unit tests to confirm the correct output before merge, present possible outcome to users.

Solution References:

“Get Your Spreadsheets Under (Version) Control”: https://web.fe.up.pt/~jacome/downloads/CIBSE19a.pdf

“SheetGit: A Tool for Collaborative Spreadsheet Development”: https://spreadsheetsunl.github.io/sheetgit/paper.pdf

https://tech.marksblogg.com/git-track-changes-in-media-office-documents.html 

https://blog.golayer.io/excel/excel-version-control 

https://github.com/LibreOffice/mso-dumper/blob/master/xls-dump.py 

ExcelCompare: https://github.com/na-ka-na/ExcelCompare 

What Excel workbook format should be supported in this product?

Business model

Since this is a SaaS platform, we will provide various pricing plans for client to subscribe.  Each pricing plan will have restricted functionalities based on 


Reference: Github pricing https://github.com/pricing 

Backward compatibility?

In the current platform, previous version of workbooks are stored individually.  Applying file versioning control in the old files before this product release is NOT in the scope.

Scope

Github is a powerful platform which provides a lot more services than basic code hosting and sharing. I’m using Github as a reference for this product but this product might not implement all Github features since they are all decided based on customer use cases and make sure the product aligns with the product vision.

MUST HAVE

NICE TO HAVE

NOT IN SCOPE

Roadmap

Roadmap provides a structured, forward-looking plan that aligns that aligns the product's development with business goals, and ensures efficient resource allocation and prioritization.

Specifications

Important Note: Priority codes play a crucial role in organizing and prioritizing tasks, indicating the urgency level of a project. These codes, ranging from P0 to P4, help establish priorities, with P0 representing the highest urgency and P4 the lowest.

Epic

Functional requirements

Priority (P0-P4)

1

INFRA

"As the Product Manager, I need to self-host git on the existing SaaS platform so it can track changes for the incoming workbooks."

P0

2

INFRA

"As the platform admin, I need to set up a storage space in the cloud platform for each company workspace with fixed storage size (2GB) so that clients to host solutions (Excel workbook) in the SaaS platform."

P0

3

INFRA

"As the platform admin, I need to have an option to configure storage space based on client’s subscription."

P2

4

GIT

"As the Product Manager, I need a function to convert the incoming workbook file into text format so the platform can track changes via git.  Focus on these type of cell formats:

P0

5

GIT

"As the Product Manager, I need a function to convert the incoming workbook file into text format so the platform can track changes via git.  Work on these types of cell formats:

P1

6

GIT

"As the solution contributors, I need a function to create new branch from a solution and then switch to the new branch, so my modifications won’t affect the master branch."

P2

7

GIT

"As the solution contributors, I need a web-based file diff screen to view the file difference between the two branches, so I can choose which version of Excel cell formulas that I should keep before merge."

P1

8

USER

"As the platform admin, I need to create a number of user accounts on the platform, registered by First Name, Last Name and Email, so to grant access to the solution for platform users."

P0

9

USER

"As the platform admin, I need to create 3 user roles (Admin, Contributor, Viewer) to be assigned to each user account so to configure access controls for each user account.

P3

10

SOCIAL

"As the solution viewers, I need to create a discussion thread within the solution space so I can share my enquiries and feedback and let others to express their thoughts."

P4

11

SOCIAL

"As the solution contributor, I need to have a function to share my solution to other users and notifiy them."

P1

User Interactions & Design

Open Questions

Web-based repository source-code hosting services for software development and version control using Git.

2. Why didn’t we just integrate Github for Excel directly?

3. “How does the product helps to educate non-technical users on learning Git and understand the usage certain Git functions, e.g. merge?” 

Educate “Git merge in Excel” to make the UX be more outcome-focused.


Prior the “Merge” process:

4. The workbook has a combinations of Xinput and Xoutput cells.  Certain cell content should not be tracked.  What would you suggest to handle this? 

Reference from the .gitignore configurations.

It’s possible to build a .cellignore in this product that it ignores the cell content that should not be tracked, especially those XInput cells.

Criteria: