Repository: bikeshedder/tusker
Branch: master
Commit: 7be4175038c5
Files: 8
Total size: 29.3 KB

Directory structure:
gitextract_r5z9r1w1/
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── README.md
├── pyproject.toml
├── tusker/
│   ├── __init__.py
│   └── config.py
└── tusker.toml.example

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
.env
tusker.toml
tusker.egg-info/
__pycache__/

================================================
FILE: CHANGELOG.md
================================================
# Change Log

## v0.5.1

* Fix error message for invalid backends
* Fix validation of unique backends
* Fix error when `--(un)safe` or `--(no)privileges` were not passed
  as arguments.
* Replace `psycopg2-binary` dependency by `psycopg2`

## v0.5.0

* Added support for glob pattern lists for `schema.filename` and
  `migrations.filename`. Plain strings are still supported.
* Add support for interpolated environment variables within config files.
* Deprecate `migrations.directory` configuration option.
* Update `tomlkit` to version `0.11`
* Update locked dependency versions

## v0.4.8

* Fix "`TypeError: dict is not a sequence`" error when the schema or
  migration files contain percent characters (`%`).

## v0.4.7

* Fix "A value is required for bind parameter ..." error caused
  by SQL files containing code looking like SQLAlchemy parameters (`:`).

## v0.4.6 [YANKED]

## v0.4.5

* Add support for `**` in glob pattern
* Improve output of SQL errors

## v0.4.4

* Add default config for `migra` config section

## v0.4.3

* Fix `privileges` configuration option

## v0.4.2

* Add `migra.safe` and `migra.permission` to `tusker.toml`
* Add `--safe` and `--unsafe` arguments
* Add `--without-privileges` argument
* Update `tomlkit` to version `0.10`
* Update locked dependency versions

## v0.4.1

* Do not filter by `.sql` extension when using the
  `migrations.filename` setting.

## v0.4.0

* Add `migrations.filename` setting which supports a `glob` pattern
* Fix error messages for invalid configurations
* Increase minimum `python` version to `3.6`
* Update `migra` to version `3.0`
* Update `tomlkit` to version `0.7`
* Update `sqlalchemy` to version `1.4`
* Update `psycopg2` to version `2.9`

## v0.3.4

* Fix quoting of database names

## v0.3.3

* Add support for mixing url with other database settings

## v0.3.2

* Fix transaction handling

## v0.3.1

* Execute files specified by `glob` pattern in sorted order

## v0.3.0

* Add `--version` argument
* Add `glob` pattern support for `schema.filename` setting

## v0.2.3

* Replace f-Strings by .format() calls. This fixes Python 3.5 support.

## v0.2.2

* Add support for `database.schema` config option

## v0.2.1

* Add `--with-privileges` option to `diff` and `check` commands.

## v0.2.0

* Add `from` and `to` argument to `diff` command which makes it
  possible to compare a schema file, migration files and an
  existing database.
* Add `--reverse` option to `diff` command.
* Add `check` command

## v0.1.2

* Fix closing of DB connections

## v0.1.1

* Escape schema and migration SQL before execution

## v0.1.0

* First release

================================================
FILE: LICENSE
================================================
This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.

In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to

================================================
FILE: README.md
================================================
# Tusker

[![GitHub](https://img.shields.io/github/license/bikeshedder/tusker?label=License&logoColor=white&style=for-the-badge)](https://github.com/bikeshedder/tusker/blob/master/LICENSE)
[![PyPI](https://img.shields.io/pypi/v/tusker?label=PyPI&logo=pypi&logoColor=white&style=for-the-badge)](https://pypi.org/project/tusker)

A PostgreSQL specific migration tool

## Elevator pitch

Do you want to write your database schema directly as SQL
which is understood by PostgreSQL?

Do you want to be able to make changes to this schema and
generate the SQL which is required to migrate between the
old and new schema version?

Tusker does exactly this.

## Installation

```shell
pipx install tusker
```

Now you should be able to run tusker.
Give it a try:

```shell
tusker --help
```

## Getting started

Once tusker is installed create a new file called `schema.sql`:

```sql
CREATE TABLE fruit (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY,
    name TEXT NOT NULL UNIQUE
);
```

You probably want to create an empty `migrations` directory, too:

```shell
mkdir migrations
```

Now you should be able to create your first migration:

```
tusker diff
```

The migration is printed to the console and all you need to do is
copy and paste the output into a new file in the migrations
directory. Alternatively you can also pipe the output of `tusker diff`
into the target file:

```
tusker diff > migrations/0001_initial.sql
```

After that check that your `schema.sql` and your `migrations` are in sync:

```
tusker diff
```

This should give you an empty output. This means that there is no
difference between applying the migrations in order and the target schema.

Alternatively you can run the check command:

```
tusker check
```

If you want to change the schema in the future simply change the
`schema.sql` and run `tusker diff` to create the migration for you.

Give it a try and change the `schema.sql`:

```sql
CREATE TABLE fruit (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY,
    name TEXT NOT NULL UNIQUE,
    color TEXT NOT NULL DEFAULT ''
);
```

Create a new migration:

```
tusker diff > migrations/0002_fruit_color.sql
```

**Congratulations! You are now using SQL to write your migrations.
You are no longer limited by a 3rd party data definition language
or an object relational wrapper.**

## Configuration

In order to run tusker you do not need a configuration file.
The following defaults are assumed:

- The file containing your database schema is called `schema.sql`
- The directory containing the migrations is called `migrations`
- Your current user can connect to the database using a unix
  domain socket without a password.

You can also create a configuration file called `tusker.toml`.
The default configuration looks like this:

```toml
[schema]
filename = "schema.sql"

[migrations]
filename = "migrations/*.sql"

[database]
#host = ""
#port = 5432
#user = ""
#password = ""
dbname = "my_awesome_db"
#schema = "public"

[migra]
safe = false
privileges = false
```

Instead of the exploded form of `host`, `port`, etc. it is also
possible to pass a connection URL:

```toml
[schema]
filename = "schema.sql"

[migrations]
filename = "migrations/*.sql"

[database]
url = "postgresql:///my_awesome_db_connection"
```

You can also use an environment variable in place of a hard-coded value:

```toml
[database]
url = "${DATABASE_URL}"
```

## How can I use the generated SQL files?

The resulting SQL files can either be applied to the database
by hand or by using one of the many great tools and libraries
which support applying SQL files in order.

Some recommendations are:

- NodeJS: [marv](https://www.npmjs.com/package/marv)
- Rust: [refinery](https://crates.io/crates/refinery)

## How does it work?

Upon startup `tusker` reads all files matched by the `migrations`
setting and runs them on an empty database. Another empty database
is created and the target schema is applied to it. Then those two
schemas are diffed using the excellent
[migra](https://pypi.org/project/migra/) tool and the output is
printed to the console.

## Tusker is `unsafe` by default

Unlike `migra` the `tusker` command by default does not throw an
exception when a `drop`-statement is generated. Always check your
generated migrations prior to running them. If you want the same
behavior as migra you can either use the `--safe` argument or set
the `migra.safe` configuration option to `true` in your `tusker.toml`
file.

## FAQ

### Is it possible to split the schema into multiple files?

Yes. This feature has been added in 0.3. You can now use `glob`
patterns as part of the `schema.filename` setting.
e.g.:

```toml
[schema]
filename = "schema/*.sql"
```

As of 0.4.5 recursive glob patterns are supported as well:

```toml
[schema]
filename = "schema/**/*.sql"
```

### Is it possible to diff the schema and/or migrations against an existing database?

Yes. This feature has been added in 0.2. You can pass a `from` and `to`
argument to the `tusker diff` command. Check the output of `tusker diff --help`
for more details.

### How can I export the initial schema from an existing database?

For exporting the initial schema you can use the native Postgres
[pg_dump](https://www.postgresql.org/docs/current/app-pgdump.html)
command with the `--schema-only` option.

### Tusker printed an error and left the temporary databases behind. How can I remove them?

Run `tusker clean`. This will remove all databases which were created
by previous runs of tusker. Tusker only removes databases which are
marked with a `CREATED BY TUSKER` comment.

### What does the `dbname` setting in `tusker.toml` mean?

The `dbname` setting in `tusker.toml` specifies the database name to be
used when diffing against your database. This command will print out
the difference between the current database schema and the target schema:

```shell
tusker diff database
```

Note that this command is meant to be run after you have migrated your database.

Tusker also needs to create temporary databases when diffing against the
`schema` and/or `migrations`. The two databases are called
`{dbname}_{timestamp}_schema` and `{dbname}_{timestamp}_migrations`.

The `dbname` setting overrides the database name in the connection `url`
(if specified). If neither a `dbname` nor a `url` is specified it will
default to `tusker`. Calling `tusker diff database` only makes sense if
you specify a `dbname` or your application does indeed use a database
called `tusker`.
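
The naming rules above can be illustrated with a small, self-contained sketch. It is not tusker's actual code: `resolve_dbname` and `temp_database_names` are hypothetical helper names, and the URL handling uses `urllib.parse` as a simple stand-in for real DSN parsing.

```python
import time
from urllib.parse import urlsplit


def resolve_dbname(dbname=None, url=None):
    """Hypothetical helper: pick the database name as described above."""
    # An explicit dbname always wins over the name inside the URL.
    if dbname:
        return dbname
    if url:
        # Assumption: the database name is the path component of the URL.
        name = urlsplit(url).path.lstrip('/')
        if name:
            return name
    # Neither dbname nor a usable url: fall back to the default.
    return 'tusker'


def temp_database_names(dbname, now=None):
    """Hypothetical helper: names of the two temporary databases for one run."""
    timestamp = int(time.time()) if now is None else now
    return [
        '{}_{}_{}'.format(dbname, timestamp, suffix)
        for suffix in ('schema', 'migrations')
    ]


if __name__ == '__main__':
    name = resolve_dbname(url='postgresql:///my_awesome_db_connection')
    print(name)
    print(temp_database_names(name))
```

Running the sketch prints the resolved name followed by the two temporary database names for the current timestamp, which makes it easy to recognize leftovers of an interrupted run before calling `tusker clean`.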

================================================
FILE: pyproject.toml
================================================
[tool.poetry]
name = "tusker"
version = "0.5.1"
authors = ["Michael P. Jung "]
license = "Unlicense"
readme = "README.md"
description = "A PostgreSQL specific migration tool"
repository = "https://github.com/bikeshedder/tusker"
homepage = "https://github.com/bikeshedder/tusker"
classifiers = [
    "Programming Language :: Python",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3 :: Only",
    "Development Status :: 4 - Beta",
    "Topic :: Database",
    "Topic :: Utilities",
]

[tool.poetry.scripts]
tusker = "tusker:main"

[tool.poetry.dependencies]
python = "^3.7"
importlib-metadata = {version = "^1.0", python = "<3.8"}
migra = "^3.0.1621480950"
tomlkit = "^0.11"
sqlalchemy = "^1.4.25"
psycopg2 = "^2.9.5"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

================================================
FILE: tusker/__init__.py
================================================
import argparse
from contextlib import contextmanager, ExitStack
from glob import glob
import sys
import time
import warnings

import migra
import psycopg2
from psycopg2 import sql
import sqlalchemy

from .config import Config


TUSKER_COMMENT = (
    'CREATED BY TUSKER - If this table is left behind tusker probably '
    'crashed and was not able to clean up after itself. Either try '
    'running `tusker clean` or remove this database manually.'
)

try:
    import importlib.metadata as importlib_metadata
except ModuleNotFoundError:
    import importlib_metadata

try:
    __version__ = importlib_metadata.version(__name__)
except Exception:
    # The package metadata is unavailable (e.g. running from a checkout).
    __version__ = 'unknown'


class ExecuteSqlError(Exception):
    pass


def execute_sql_file(cursor, filename):
    with open(filename) as fh:
        sql = fh.read()
    sql = sql.strip()
    if not sql:
        return
    try:
        # Escape percent characters so the driver does not treat them
        # as parameter placeholders.
        cursor.exec_driver_sql(sql.replace('%', '%%'))
    except sqlalchemy.exc.SQLAlchemyError as e:
        # https://github.com/sqlalchemy/sqlalchemy/blob/9e7c068d669b209713da62da5748579f92d98129/lib/sqlalchemy/exc.py#L699-L709
        # To provide more detail on the underlying error, but without printing the original SQL.
        # Not every SQLAlchemyError carries an `orig` attribute.
        orig = getattr(e, 'orig', None)
        if orig:
            error_text = "(%s.%s) %s" % (orig.__class__.__module__, orig.__class__.__name__, str(orig))
        else:
            error_text = str(e)
        raise ExecuteSqlError('Error executing SQL file {}: {}'.format(filename, error_text))


class Tusker:

    def __init__(self, config: Config, verbose=False):
        self.config = config
        self.verbose = verbose
        self.conn = self._connect('template1')
        self.conn.autocommit = True

    def _connect(self, name):
        # Connect to the database given by `name` instead of a hard-coded one.
        args = self.config.database.args(dbname=name)
        return psycopg2.connect(**args)

    def log(self, text):
        if self.verbose:
            print(text, file=sys.stderr)

    @contextmanager
    def createengine(self, dbname=None):
        override = {'dbname': dbname} if dbname else {}
        engine = sqlalchemy.create_engine(
            'postgresql://',
            connect_args=self.config.database.args(**override)
        )
        try:
            yield engine
        finally:
            engine.dispose()

    @contextmanager
    def createdb(self, suffix):
        cursor = self.conn.cursor()
        now = int(time.time())
        dbname = '{}_{}_{}'.format(
            self.config.database.args()['dbname'],
            now,
            suffix
        )
        cursor.execute(sql.SQL('CREATE DATABASE {}').format(
            sql.Identifier(dbname)
        ))
        cursor.execute(sql.SQL('COMMENT ON DATABASE {} IS {}').format(
            sql.Identifier(dbname), sql.Literal(TUSKER_COMMENT)
        ))
        try:
            with self.createengine(dbname) as engine:
                yield engine
        finally:
            cursor.execute(sql.SQL('DROP DATABASE {}').format(
                sql.Identifier(dbname)
            ))

    @contextmanager
    def mgr_schema(self):
        with self.createdb('schema') as schema_engine:
            with schema_engine.begin() as schema_cursor:
                self.log('Creating original schema...')
                for filename in self._get_schema_files():
                    self.log('- {}'.format(filename))
                    execute_sql_file(schema_cursor, filename)
            yield schema_engine

    @contextmanager
    def mgr_migrations(self):
        with self.createdb('migrations') as migrations_engine:
            with migrations_engine.begin() as migrations_cursor:
                self.log('Creating migrated schema...')
                for filename in self._get_migration_files():
                    self.log('- {}'.format(filename))
                    execute_sql_file(migrations_cursor, filename)
            yield migrations_engine

    @contextmanager
    def mgr_database(self):
        with self.createengine() as database_engine:
            with database_engine.begin() as database_cursor:
                self.log('Observing database schema...')
            yield database_engine

    def mgr(self, name):
        return getattr(self, 'mgr_{}'.format(name))()

    def diff(self, source, target):
        self.log('Creating databases...')
        with self.mgr(source) as source, self.mgr(target) as target:
            self.log('Diffing...')
            migration = migra.Migration(
                source,
                target,
                self.config.database.schema,
            )
            migration.set_safety(self.config.migra.safe)
            migration.add_all_changes(privileges=self.config.migra.privileges)
            return migration.sql

    def check(self, backends):
        with ExitStack() as stack:
            managers = [(name, stack.enter_context(self.mgr(name))) for name in backends]
            for i in range(len(managers)-1):
                source, target = (managers[i], managers[i+1])
                self.log('Diffing {} against {}...'.format(
                    source[0], target[0]
                ))
                migration = migra.Migration(
                    source[1], target[1],
                    schema=self.config.database.schema
                )
                migration.set_safety(self.config.migra.safe)
                migration.add_all_changes(privileges=self.config.migra.privileges)
                if migration.sql:
                    return (source[0], target[0])
        return None

    def clean(self):
        cursor = self.conn.cursor()
        try:
            cursor.execute('''
                SELECT db.datname
                FROM pg_database db
                JOIN pg_shdescription dsc ON dsc.objoid = db.oid
                WHERE
                    dsc.description = %s;
            ''', (TUSKER_COMMENT,))
            rows = cursor.fetchall()
            for row in rows:
                dbname = row[0]
                self.log('Dropping {} ...'.format(dbname))
                cursor.execute(sql.SQL('DROP DATABASE {}').format(
                    sql.Identifier(dbname)
                ))
        finally:
            cursor.close()

    def _get_schema_files(self):
        for pattern in self.config.schema.filename:
            yield from sorted(glob(pattern, recursive=True))

    def _get_migration_files(self):
        for pattern in self.config.migrations.filename:
            yield from sorted(glob(pattern, recursive=True))


def cmd_diff(args, cfg: Config):
    tusker = Tusker(cfg, args.verbose)
    source = args.source
    target = args.target
    if args.reverse:
        source, target = target, source
    try:
        sql = tusker.diff(source, target)
        print(sql, end='')
    except ExecuteSqlError as e:
        print(str(e), file=sys.stderr)
        sys.exit(1)


def cmd_check(args, cfg: Config):
    backends = args.backends
    if 'all' in backends:
        backends = ['migrations', 'schema', 'database']
    tusker = Tusker(cfg, args.verbose)
    try:
        diff = tusker.check(backends)
    except ExecuteSqlError as e:
        print(str(e), file=sys.stderr)
        sys.exit(1)
    if diff:
        print('Schemas differ: {} != {}'.format(diff[0], diff[1]))
        print('Run `tusker diff` to see the differences')
        sys.exit(1)
    else:
        print('Schemas are identical')
        sys.exit(0)


def cmd_clean(args, cfg: Config):
    tusker = Tusker(cfg, args.verbose)
    tusker.clean()


BACKEND_CHOICES = ['migrations', 'schema', 'database']


class ValidateBackends(argparse.Action):
    def __call__(self, parser, args, values, option_string=None):
        if 'all' in values:
            values = BACKEND_CHOICES
        else:
            if len(values) <= 1:
                choices = ', '.join(map(repr, BACKEND_CHOICES))
                raise argparse.ArgumentError(
                    self,
                    (
                        'at least two backends are required to perform '
                        'the check (choose from {choices}) or pass \'all\' '
                        'on its own.'.format(choices=choices)
                    )
                )
            backends = set()
            for value in values:
                if value not in BACKEND_CHOICES:
                    choices = ', '.join(map(repr, BACKEND_CHOICES + ['all']))
                    msg = 'invalid choice: {!r} (choose from {})'.format(
                        value, choices
                    )
                    raise argparse.ArgumentError(self, msg)
                if value in backends:
                    msg = 'duplicate found in backend list: {}'.format(value)
                    raise argparse.ArgumentError(self, msg)
                backends.add(value)
        setattr(args, self.dest, values)


def add_migra_args(parser):
    g = parser.add_mutually_exclusive_group()
    g.add_argument(
        '--safe',
        help='throw an exception if drop-statements are generated.',
        action='store_const',
        dest='safe',
        const=True,
    )
    g.add_argument(
        '--unsafe',
        help='don\'t throw an exception if drop-statements are generated.',
        action='store_const',
        dest='safe',
        const=False,
    )
    g = parser.add_mutually_exclusive_group()
    g.add_argument(
        '--with-privileges',
        help='output privilege differences (i.e. grant/revoke statements).',
        action='store_const',
        dest='privileges',
        const=True,
    )
    g.add_argument(
        '--without-privileges',
        help='don\'t output privilege differences.',
        action='store_const',
        dest='privileges',
        const=False,
    )


def main():
    if not sys.warnoptions:
        warnings.simplefilter("default")
    parser = argparse.ArgumentParser(
        description='Generate a database migration.')
    parser.add_argument(
        '--version',
        action='version',
        version='%(prog)s {}'.format(__version__))
    parser.add_argument(
        '--verbose',
        help='enable verbose output',
        action='store_true',
        default=False)
    parser.add_argument(
        '--config', '-c',
        help='the configuration file. Default: tusker.toml',
        default='tusker.toml')
    subparsers = parser.add_subparsers(
        dest='command',
        required=True)
    parser_diff = subparsers.add_parser(
        'diff',
        help='show differences between two schemas',
        description='''
            This command calculates the difference between two database schemas.
            The from- and to-parameter accept one of the following backends:
            migrations, schema, database
        ''')
    parser_diff.add_argument(
        'source',
        metavar='from',
        nargs='?',
        help='from-backend for the diff operation. Default: migrations',
        choices=BACKEND_CHOICES,
        default='migrations')
    parser_diff.add_argument(
        'target',
        metavar='to',
        nargs='?',
        help='to-backend for the diff operation. Default: schema',
        choices=BACKEND_CHOICES,
        default='schema')
    parser_diff.add_argument(
        '--reverse', '-r',
        help='swaps the "from" and "to" arguments creating a reverse diff',
        action='store_true')
    parser_diff.add_argument(
        '--create-extensions-only',
        help='Only output create extension statements, nothing else. ',
        action='store_true',
    )
    add_migra_args(parser_diff)
    parser_diff.set_defaults(func=cmd_diff)
    parser_check = subparsers.add_parser(
        'check',
        help='check for differences between schemas',
        description='''
            This command checks for differences between two or more schemas.
            Exit code 0 means that the schemas are all in sync. Otherwise the
            exit code 1 is used. This is useful for continuous integration checks.
        ''')
    parser_check.set_defaults(func=cmd_check)
    parser_check.add_argument(
        'backends',
        help=(
            'at least two backends are required to diff against each other '
            '(choose from {}). You can also pass \'all\' on its own to diff '
            'all backends against each other.'
        ).format(
            ', '.join(map(repr, BACKEND_CHOICES))
        ),
        metavar='backend',
        nargs='*',
        default=['migrations', 'schema'],
        action=ValidateBackends
    )
    add_migra_args(parser_check)
    parser_clean = subparsers.add_parser(
        'clean',
        help='clean up left over *_migrations or *_schema databases')
    parser_clean.set_defaults(func=cmd_clean)
    args = parser.parse_args()
    if hasattr(args, 'source') and hasattr(args, 'target') and args.source == args.target:
        parser.error('to- and from-backend must not be identical')
    cfg = Config(args.config)
    # Command line arguments take precedence over the configuration file.
    if getattr(args, 'safe', None) is not None:
        cfg.migra.safe = args.safe
    if getattr(args, 'privileges', None) is not None:
        cfg.migra.privileges = args.privileges
    args.func(args, cfg)

================================================
FILE: tusker/config.py
================================================
import os
import re

from psycopg2.extensions import parse_dsn
from tomlkit.toml_file import TOMLFile


class Config:

    def __init__(self, filename=None):
        filename = filename or 'tusker.toml'
        toml = TOMLFile(filename)
        try:
            data = toml.read()
        except FileNotFoundError:
            data = {}
        # time to validate some configuration variables
        data.setdefault('database', {'dbname': 'tusker'})
        data.setdefault('schema', {'filename': ['schema.sql']})
        data.setdefault('migrations', {'filename': ['migrations/*.sql']})
        data.setdefault('migra', {'safe': False, 'privileges': False})
        self.schema = SchemaConfig(data['schema'])
        self.migrations = MigrationsConfig(data['migrations'])
        self.database = DatabaseConfig(data['database'])
        self.migra = MigraConfig(data['migra'])

    def __str__(self):
        return 'Config(schema={}, migrations={}, database={}, migra={})'.format(
            self.schema,
            self.migrations,
            self.database,
            self.migra
        )


def replace_from_env_var(matchobj):
    env_variable = matchobj.group(1)
    try:
        return os.environ[env_variable]
    except KeyError:
        raise ConfigError.missing_env(env_variable)


class ConfigReader:

    def __init__(self, data, path):
        self.data = data
        self.path = path

    def get(self, name, type, required=False, default=None):
        if name not in self.data:
            if required:
                raise ConfigError.missing('{}.{}'.format(self.path, name))
            else:
                return default
        value = self.data[name]
        if isinstance(value, str):
            # Replace any environment variables
            value = re.sub(r"\${([a-zA-Z_][a-zA-Z_0-9]*)}", replace_from_env_var, value)
        if not isinstance(value, type):
            raise ConfigError.invalid(name, 'Not of type {}'.format(type))
        return value

    def get_list(self, name, required=False, default=None):
        value = self.get(name, (str, list), required, default)
        if isinstance(value, str):
            value = [value]
        else:
            if value and not all(isinstance(x, str) for x in value):
                raise ConfigError.invalid(name, 'Not a list of strings {}'.format(value))
        return value


class SchemaConfig:

    def __init__(self, data):
        data = ConfigReader(data, 'schema')
        self.filename = data.get_list('filename', default=['schema.sql'])

    def __str__(self):
        return 'SchemaConfig({!r})'.format(self.__dict__)


class MigrationsConfig:

    def __init__(self, data):
        data = ConfigReader(data, 'migrations')
        directory = data.get('directory', str, False)
        if directory:
            import warnings
            warnings.warn(
                'The "migrations.directory" configuration option is '
                'deprecated and support for this option will be removed '
                'in the next version of tusker. Please replace this by '
                'the "migrations.filename" option which does support '
                'globbing patterns.',
                DeprecationWarning,
                stacklevel=2
            )
            filename = data.get_list('filename')
            if filename:
                # ConfigError.invalid() expects both a name and a reason.
                raise ConfigError.invalid(
                    'migrations',
                    'directory and filename parameters '
                    'are mutually exclusive',
                )
            else:
                self.filename = ['{}/*.sql'.format(directory)]
        else:
            self.filename = data.get_list('filename', default=['migrations/*.sql'])

    def __str__(self):
        return 'MigrationsConfig({!r})'.format(self.__dict__)


class DatabaseConfig:

    def __init__(self, data):
        data = ConfigReader(data, 'database')
        self.url = data.get('url', str)
        self.host = data.get('host', str)
        self.port = data.get('port', int)
        self.dbname = data.get('dbname', str)
        self.user = data.get('user', str)
        self.password = data.get('password', str)
        self.schema = data.get('schema', str)

    def __str__(self):
        return 'DatabaseConfig({!r})'.format(self.__dict__)

    def args(self, **override):
        if self.url:
            args = parse_dsn(self.url)
        else:
            args = {}
        for k in ['host', 'port', 'dbname', 'user', 'password']:
            v = getattr(self, k)
            if v is not None:
                args[k] = v
        # Use .get() so a missing dbname falls back to the default
        # instead of raising a KeyError.
        if not args.get('dbname'):
            args['dbname'] = 'tusker'
        args.update(override)
        return args


class MigraConfig:

    def __init__(self, data):
        data = ConfigReader(data, 'migra')
        self.safe = data.get('safe', bool, default=False)
        self.privileges = data.get('privileges', bool, default=False)


class ConfigError(RuntimeError):

    @classmethod
    def missing_env(cls, env_variable):
        return cls('Missing environment variable: {}'.format(env_variable))

    @classmethod
    def missing(cls, name):
        return cls('Missing configuration: {}'.format(name))

    @classmethod
    def invalid(cls, name, reason):
        return cls('Invalid configuration: {}, {}'.format(name, reason))

================================================
FILE: tusker.toml.example
================================================
[schema]
filename = "schema.sql"

[migrations]
filename = "migrations/*.sql"

[database]
#host = ""
#port = 5432
#user = ""
#password = ""
dbname = "my_awesome_db"
#schema = "public"

[migra]
safe = false
privileges = false