1.6.12.1. grept
When managing Terraform modules at scale, we often need to ensure that every module repository adheres to unified specifications and best practices. The open-source tool grept from Azure was created for this very purpose. grept (named after Go REPository linTer) is a linting tool designed for repository governance. It parses configuration files, checks repository contents against predefined rules, and automatically generates modification plans or applies changes directly. Below, we will delve into the background and purpose of grept, its specific usage (combined with practical examples), and its critical role in a module governance framework.
1.6.12.1.1. Purpose and Background of grept
As the number of Terraform modules grows, maintaining consistency in structure and content across all module repositories becomes increasingly difficult. Manually checking each repository for complete files (such as ensuring the correct LICENSE, README, CI configuration, etc.) or urging maintainers to update template files themselves is not only time-consuming and laborious but also prone to oversight. grept emerged to solve these pain points.
The design philosophy of grept is to ensure code repositories adhere to established standards through automation. Inspired by tools like RepoLinter, it offers highly extensible repository linting capabilities. For the governance of Azure Terraform modules, grept provides a way to define rules centrally: we can stipulate in a central configuration which files, content, and settings must exist in all module repositories, and grept performs batch checks. This approach significantly reduces the burden on maintainers and guarantees the consistency of module repositories and the quality of the codebase.
In the Azure Verified Modules (AVM) project, grept has become a key component of the governance process. The Azure team stores all validation rules centrally in the Azure-Verified-Modules-Grept repository, and grept uses these configurations to scan individual module repositories. For instance, AVM defines files that need to be synchronized with the template repository for resource module repositories, such as LICENSE, code specification files, CI workflows, etc. With grept, the presence and content of these files can be automatically verified. Once a module repository lacks these files or falls behind in version, grept flags it and proposes a fix.
In summary, grept enables us to govern dozens or hundreds of Terraform module repositories in a standardized and repeatable manner, solving the problems of error-prone manual checks and difficulty in scaling. It ensures that every repository meets predefined compliance requirements, thereby maintaining consistency and high quality across repositories.
1.6.12.1.2. How to Use grept
From a user perspective, the usage process of grept is similar to the "plan" and "apply" phases of Infrastructure as Code tools like Terraform. We can define repository governance rules by writing HCL configuration files (with the extension .grept.hcl), and then run grept to check and apply these rules. Let's break down its usage into several aspects:
1. Preparing the grept Configuration: grept configuration uses HCL syntax and consists of three main components: data, rule, and fix.
- data blocks are used to collect necessary information. This can be reading file/directory information from the local repository or fetching data over the network. For example, you can use the built-in http data source to fetch file content from a remote URL, or use the github_repository_teams data source to get a list of teams for a GitHub repository.
- rule blocks define specific check conditions; each rule corresponds to a constraint that must be met. grept has built-in rule types such as dir_exist (checks if a directory exists), file_hash (checks if file hashes match), and must_be_true (custom check rules). A typical usage of a rule is to reference data collected by data blocks to make a judgment.
- fix blocks define the actions to take when a rule is not met. A fix references a corresponding rule (via rule_ids) and executes when the rule fails. For example, a local_file type fix can replace the content of a missing or non-compliant local file, and a git_ignore fix can automatically add specified entries to .gitignore. Through the combination of rule and fix, grept can not only "discover" problems but also "fix" them automatically.
2. Writing Rule Examples: Let's illustrate how to write a configuration with a concrete example. Suppose we want to ensure that every module repository contains a standard MIT LICENSE file and that its content matches the official template. We can write the following grept configuration (excerpted from the actual rules of Azure Verified Modules):
data "http" "mit_license" {
  url = "https://raw.githubusercontent.com/Azure/terraform-azurerm-avm-template/main/LICENSE"
}

rule "file_hash" "license" {
  glob = "LICENSE"
  hash = sha1(data.http.mit_license.response_body)
}

fix "local_file" "license" {
  rule_ids = [rule.file_hash.license.id]
  paths    = [rule.file_hash.license.glob]
  content  = data.http.mit_license.response_body
}
The above configuration consists of three parts: data.http.mit_license fetches the standard MIT license text from a remote source; rule.file_hash.license calculates the hash of the LICENSE file in the current repository and compares it with the hash of the template content; fix.local_file.license specifies that if the rule fails (meaning the current repository LICENSE does not exist or the content does not match), the template text is written to the local LICENSE file. Through these blocks, grept can automatically check and fix the license file, keeping it synchronized with the standard template.
Similarly, we can write rules for other specifications that need enforcement, such as: verifying that the .gitignore file contains necessary ignore items and automatically adding them if missing; checking if the repository has CI/CD workflow configuration files and ensuring their content matches the latest template; checking for the existence of a code owners (CODEOWNERS) file and that it includes the correct teams/personnel, etc. The rich data sources and fix actions provided by grept make these checks relatively simple. For example, you can use data.git_ignore to get the current .gitignore list and combine it with fix.git_ignore to append entries automatically. All logic can be expressed in declarative HCL; we just need to write the rule configuration in advance.
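The .gitignore check described above might be sketched as follows. This is an illustration under stated assumptions, not the actual AVM rule set. The text names the git_ignore data source, the must_be_true rule, and the git_ignore fix, but the attribute names records, condition, and exist, the block labels, and the entries in local.ignored_items are assumptions here and should be verified against the grept documentation. setsubtract is a Terraform built-in function and is assumed to be among those grept supports.

```hcl
locals {
  # Hypothetical list of entries every module repository should ignore.
  ignored_items = toset([
    ".terraform/",
    "*.tfstate",
    "*.tfstate.backup",
  ])
}

# Reads the entries currently present in the repository's .gitignore.
data "git_ignore" "current" {}

# Passes only when no required entry is missing from .gitignore.
# `records` is an assumed attribute name for the parsed entries.
rule "must_be_true" "required_ignored_items" {
  condition = length(setsubtract(local.ignored_items, toset(data.git_ignore.current.records))) == 0
}

# Adds the required entries when the rule above fails.
# `exist` is an assumed attribute name for "entries that must be present".
fix "git_ignore" "required_ignored_items" {
  rule_ids = [rule.must_be_true.required_ignored_items.id]
  exist    = local.ignored_items
}
```

The design mirrors the LICENSE example: a data block gathers the current state, a rule expresses the requirement as a boolean condition, and a fix tied to that rule through rule_ids performs the remediation.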
3. Running grept plan: After writing the configuration, you can use grept to check the target repository. Execute grept plan [configuration folder path] in the command line. grept will load all configuration files ending in .grept.hcl in that directory and then analyze the current repository against the rules. Note that [configuration folder path] can be a local path or a remote reference (such as a Git repository address); grept supports loading remote configurations via HashiCorp go-getter syntax. This means we can host governance rules centrally in a repository and reference that remote path directly when using it, which is exactly what Azure Verified Modules does.
After executing grept plan, the tool will output a "plan". If all rules pass, grept will report something like "All rule checks successful, nothing to do.". If any rules are not met, grept will list the changes that need to be made. For example, it might point out that certain files are missing or contents do not match, and how it intends to fix these issues. This plan allows us to clearly understand what does not meet the specifications before actual modifications are made.
4. Running grept apply: After confirming the plan, we can execute grept apply [configuration folder path] to implement the changes. The apply command makes the necessary changes to the repository according to the plan generated in the previous step. By default, grept may ask for confirmation before applying each fix, but the -a or --auto parameter enables auto-approval, so that all fixes are applied in one batch without interruption. When all fixes are successfully completed, grept will output a message like "Plan applied successfully.". After this step, the repository is automatically updated to a state that complies with all rules.
1.6.12.1.3. Real-World Analysis with Examples
To understand the usage of grept more intuitively, let's analyze the workflow of grept in Azure Terraform module governance with a real-world scenario. The Azure Verified Modules project uses a centralized grept configuration (stored in the Azure-Verified-Modules-Grept repository) to govern numerous module repositories. Let's take "Template File Synchronization" as an example to see how grept helps us solve problems in practice.
Scenario: Template File Synchronization. Azure officially provides a Terraform module template repository (terraform-azurerm-avm-template), which contains the standard files and configurations that a module repository should have. Over time, the template may be updated (e.g., adding more robust GitHub Actions workflow files or adjusting the README format), while existing module repositories may not keep up with these changes in time. In the traditional approach, we would need to manually notify each module maintainer to update repository files or send PRs to modify them one by one, which is extremely inefficient at scale. With grept, this process can be drastically simplified.
grept Check: In the central configuration, we have written corresponding rules for all critical template files. The LICENSE rule shown earlier is one such example. Another example is that we have rules to ensure every repository has the latest GitHub Actions workflow configuration file (e.g., .github/workflows/ci.yaml). The method is similar to the LICENSE: download the content of the corresponding file in the template repository via data "http", compare the hash value of the target repository file using rule "file_hash", and trigger fix "local_file" to update the file content if it doesn't match. For files that need to be added entirely, the rule can be designed to consider non-existence as a failure, and then the fix executes the addition.
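Following the same pattern as the LICENSE rule, a workflow-file rule could look like the sketch below. The block types and attributes are the ones already used in the LICENSE example. The block labels are hypothetical, and the sketch assumes the template repository serves the file at .github/workflows/ci.yaml, which is only the illustrative path mentioned above.

```hcl
# Fetch the reference copy of the workflow from the template repository.
# The path under main/ is an assumption for illustration.
data "http" "ci_workflow" {
  url = "https://raw.githubusercontent.com/Azure/terraform-azurerm-avm-template/main/.github/workflows/ci.yaml"
}

# Compares the hash of the local workflow file with the hash of the template copy.
rule "file_hash" "ci_workflow" {
  glob = ".github/workflows/ci.yaml"
  hash = sha1(data.http.ci_workflow.response_body)
}

# Writes the template content to the local path when the rule fails.
fix "local_file" "ci_workflow" {
  rule_ids = [rule.file_hash.ci_workflow.id]
  paths    = [rule.file_hash.ci_workflow.glob]
  content  = data.http.ci_workflow.response_body
}
```

Because the expected hash is computed from the template content at run time, an update to the template is picked up on the next grept run without any change to the rule itself.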
AVM currently integrates grept into its CI pipeline and pre-commit checks. For every pull request, the pipeline runs grept and then checks whether git status reports any file changes. If it does, grept had something to fix, which means the repository was out of compliance, and the pipeline run is blocked. Because grept has already determined the remediation for every discrepancy from the configuration, contributors only need to run make pre-commit before submitting to fix the issues.
1.6.12.1.4. grept Code Samples
Similar to Terraform, grept also supports for_each to implement multi-instance declarations, as shown in the following example:
locals {
  synced_files = toset([
    "_footer.md",
    ".github/CODEOWNERS",
    ".github/ISSUE_TEMPLATE/avm_module_issue.yml",
    ".github/ISSUE_TEMPLATE/avm_question_feedback.yml",
    ".github/ISSUE_TEMPLATE/config.yml",
    ".github/PULL_REQUEST_TEMPLATE.md",
    ".github/policies/avmrequiredfiles.yml",
    ".github/policies/eventResponder.yml",
    ".github/policies/scheduledSearches.yml",
    ".github/workflows/e2e.yml",
    ".github/workflows/linting.yml",
    ".github/workflows/version-check.yml",
    ".terraform-docs.yml",
    "avm.bat",
    "CODE_OF_CONDUCT.md",
    "CONTRIBUTING.md",
    "examples/.terraform-docs.yml",
    "LICENSE",
    "Makefile",
    "SECURITY.md",
    ".editorconfig",
  ])
}

data "http" "synced_files" {
  for_each        = local.synced_files
  request_headers = merge({}, local.common_http_headers)
  url             = "${local.url_prefix}/${each.value}"
}

rule "file_hash" "synced_files" {
  for_each = local.synced_files
  glob     = each.value
  hash     = sha1(data.http.synced_files[each.value].response_body)
}

fix "local_file" "synced_files" {
  for_each = local.synced_files
  rule_ids = [rule.file_hash.synced_files[each.value].id]
  paths    = [each.value]
  content  = data.http.synced_files[each.value].response_body
}
Through for_each, we can conveniently reuse code blocks to perform checks and synchronization for every file listed in local.synced_files. grept intentionally maintains compatibility with most of Terraform's syntax; you can use almost all of Terraform's built-in functions, as well as locals, variable blocks, and so on.
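The for_each example references local.url_prefix and local.common_http_headers without showing their definitions. A minimal sketch of how they might be declared with a variable block and locals follows. The variable name template_ref, its default value, and the empty header set are hypothetical and are not taken from the AVM configuration.

```hcl
# Hypothetical input: which branch or tag of the template repository to sync from.
variable "template_ref" {
  type    = string
  default = "main"
}

locals {
  # Base URL for raw file downloads from the template repository,
  # built from the same host and repository as the LICENSE example.
  url_prefix = "https://raw.githubusercontent.com/Azure/terraform-azurerm-avm-template/${var.template_ref}"

  # Headers sent with every template download; left empty in this sketch.
  common_http_headers = {}
}
```

Keeping the prefix and headers in locals means every data "http" instance created by for_each picks up a change in one place.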
Similar to Terraform's console command, grept also provides a console command, allowing users to debug various expressions in the code.
1.6.12.1.5. The Significance of grept in Governance Frameworks
Introducing grept into the Terraform module governance framework brings a qualitative improvement. Its significance is mainly reflected in the following aspects:
- Standardization and Consistency: With grept, we can enforce unified repository specifications across the entire organization. All module repositories are constrained by the same set of grept rules, meaning that whoever creates or maintains a module must meet the same requirements. Standards no longer just sit in documentation; they are implemented as automated checks and remediation measures. This standardization greatly improves consistency, and module consumers benefit from the unified structure and processes.
- Automated Verification and Correction: After translating governance requirements into code, grept can automatically and continuously verify whether repositories meet the requirements and propose correction plans immediately upon deviation. Compared to periodic manual repository audits, grept offers higher timeliness and accuracy. Moreover, due to its auto-fix capability, many issues are corrected without human intervention. As seen in the AVM project practice, the central grept workflow scans and fixes repositories according to plan, ensuring that "drift" is continuously governed.
- Scalable Management: While manual management might be feasible for a few modules, it is almost unimaginable for dozens or hundreds of module repositories without the help of automated tools. grept supports performing the same check operations on any number of repositories by writing the configuration once. This means that as the number of modules grows, the governance workload does not increase linearly; new modules just need to apply the existing rules. This scalability is crucial, allowing the governance framework to handle the expansion of the module library with ease.
- Reduced Maintenance Costs: After adopting grept, the burden on module maintainers is actually reduced. Because many repository-level requirements are gated and preliminarily fixed by automated tools, maintainers do not need to memorize every detail of the specifications. Once there is non-compliance, they receive a PR reminder initiated by grept; all they need to do is review and merge. This essentially automates the "find problem -> solve problem" flow, significantly reducing the cost of human communication and back-and-forth. It also allows the AVM central management team to quickly add and maintain Terraform configuration code that all AVM Terraform modules must have, for example, adding telemetry collection code to all modules.
- Centralized Management and Evolution of Governance Rules: grept configurations are stored centrally and version-managed, allowing us to easily update governance rules and track changes. When new specifications (such as security compliance requirements or changes in company policy) need to be introduced, we only need to update the central grept configuration and let the tool run once, and all modules will receive the update. AVM's gradual rollout of new grept rules for repository settings is likewise carried out by modifying this central configuration. This model ensures the flexibility and continuous improvement capability of the governance framework.
- Syntax Highly Compatible with Terraform: grept was deliberately designed to be highly compatible with Terraform syntax, so that technical personnel proficient in Terraform can very easily learn to write grept policies.
In conclusion, grept plays the role of an automated supervisor and fixer in the large-scale governance of Terraform modules. Whether it is synchronizing file content or adjusting repository settings, as long as we define the standards and write the rules in advance, grept can execute these checks and modifications for us repeatedly and continuously. In the practice of large-scale module governance, it significantly reduces labor costs and avoids omissions caused by inconsistent manual operations. It transforms what used to be tedious manual work into an automated process, making governance work more precise, efficient, and sustainable. The experience of Azure Verified Modules shows that repository governance implemented through grept gives us higher confidence in the quality and specification of module libraries. When module repositories maintain consistency in structure and process and constantly conform to the latest standards, the reliability and maintainability of the entire module ecosystem are enhanced. The concept of "Governance as Code" embodied by grept is likely to become an important practice for large-scale infrastructure module management.