Pyinfra: Automate Infrastructure Using Python

644 points
18 days ago
by InitEnabler

Comments


Fizzadar

Hey all, I'm the creator/primary maintainer of pyinfra! Super excited (a little terrified) to see this on the frontpage, happy to answer any questions :)

I also hang out on the Matrix room: https://matrix.to/#/#pyinfra:matrix.org

Another thing: the GH repo currently points at v3, which is in beta, and the docs for it are here: https://docs.pyinfra.com/en/next (I highly recommend starting with v3; I just haven't had time recently to wrap up the release, but it's stable).

18 days ago

negus

As you can see here, the main question is what the advantages are over Ansible, the most popular and mature agentless configuration management tool, which is also written in Python. So I propose putting that answer right on the landing page.

18 days ago

Fizzadar

I think I tried to shy away from specifically framing it as "Ansible does this badly so pyinfra does this" and instead focus on the features that differentiate it, like "Instant debugging with realtime stdin/stdout/stderr output (-vvv)". But it seems like that isn't enough and the landing page needs to be more explicit in comparison. Ty for the feedback!

18 days ago

gh02t

I applaud trying to be positive and focus on "this is what we do well," but yeah at least some explicit comparison would help. The copy right now is kind of assuming the reader already knows Ansible to compare against as a baseline. Which is probably fair for most people who find your project, but people who find your project are also probably not happy with Ansible and want to know if this addresses their pain points immediately.

It's very interesting though, I think I'm gonna try it myself.

17 days ago

Anarch157a

I like Ansible, but that doesn't mean there are no pain points.

One of them is handling "if-this-then-that-else-that". Being purely declarative, Ansible is horrible at that.

Pyinfra can be used in imperative mode, am I right? That would make using if-else a breeze, which would be a really good reason for me to switch.
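
Roughly what I have in mind, sketched from the pyinfra docs (operation and fact names are taken from the docs, but this is an unverified sketch rather than code I've run):

  # a sketch: plain Python control flow decides which operation gets declared
  from pyinfra import host
  from pyinfra.facts.server import LinuxName
  from pyinfra.operations import apt, dnf

  if host.get_fact(LinuxName) == "Debian":
      apt.packages(name="Install nginx via apt", packages=["nginx"], update=True)
  else:
      dnf.packages(name="Install nginx via dnf", packages=["nginx"])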

17 days ago

tryauuum

Ansible is declarative?

In puppet and saltstack you can declare that a folder is empty and declare a specific file in this folder. The system's smart enough to delete all the files except the one.

Achieving such a feat in Ansible is hard. The easiest way is to have two tasks: one deletes everything and a second recreates your file. That doesn't feel very declarative.

Unrelated thing: they don't even try to be declarative in Ansible. E.g. you can have a file with state "touch". It's not a state if it updates on each playbook run!

17 days ago

Anarch157a

You're confusing declarative with idempotent. Ansible is both; it won't change anything if the state is already what you declared. In the nitpicked case you chose, you want the file to have the latest timestamp, and that is a valid state to declare.

17 days ago

soraminazuki

Ansible is neither and it shows. For example, deleting a configuration in Ansible doesn't revert the system back to its original state. That's the definition of statefulness.

16 days ago

verdverm

The shell task breaks the declarative nature a bit, along with registering task results and then writing conditional whens based on them. Interpolating values based on the registered results does too imho

17 days ago

aflukasz

Plus, what you are declaring is often not the state you desire but... the action you want to take. Say you use `apt.name: [pkg1, pkg2]`, run it, then remove `pkg2` from the list. Running it again won't remove `pkg2` from your system. So it's declarative, but not necessarily at the optimal level all the time.

17 days ago

otabdeveloper4

Ansible tries to be declarative and idempotent, but fails at being both.

17 days ago

gtirloni

ansible-core succeeds pretty well in being both.

ansible modules (built-in and community) are a different matter. generally, quality will vary.

17 days ago

goodpoint

Ansible is neither.

17 days ago

mrweasel

There are a lot of cases where Puppet fails on something similar. Packages are probably the easiest example. Puppet won't remove packages not explicitly declared, nor will it remove files it created if you remove the Puppet code for managing that file.

I'm fairly sure that the way to make Puppet do what you suggest is the same in Puppet and Ansible. The difference is that Puppet is smart enough to not actually remove your file during every run (I think). On the other hand, Ansible will normally not be configured to run every 30 minutes like Puppet, so it's much less of an issue.

Both tools are great, but they work somewhat differently. How you think about using them is much the same though: you need to tell the computer what to do. In Puppet this is often talked about as if you describe the state; I suppose that's partly true, but in the end you have a series of actions the computer will need to take to achieve that state.

16 days ago

zaphirplane

You can rsync to a Puppet-created folder. You'd need to have a source folder with the file, which is fine.

17 days ago

emmelaich

rsync covers this and should be used instead of a lot of Ansible tasks.

16 days ago

verdverm

There's definitely a trend in this direction.

Pulumi for Terraform and Dagger for Docker are two examples I use

I like CUE as a language to replace my YAML; it has some of the typical language constructs but maintains the declarative approach.

17 days ago

erikbye

Is performance better than Ansible? I have used Ansible extensively and find it excruciatingly slow.

17 days ago

mxuribe

Hi @Fizzadar and congrats on making this and getting it out the door; kudos!

As you craft your "Why this and not Ansible" content, you might actually state clearly what you already noted on the Performance page, namely: "One of the reasons pyinfra was started was performance of agent-less tools at the time." If I read that, it'd instantly make me want to stick around and read some more, play with pyinfra, etc. BTW, I will be playing with it anyway, but just wanted to point out that you likely won't need to start from scratch for copy (on a comparison or answering "Why this and not Ansible" content). Cheers!

17 days ago

helsinki

Excruciatingly slow is an understatement :)

17 days ago

surfingdino

Why is being slow a bad thing? Ansible gives me a legitimate excuse to have a proper lunch. ;-)

17 days ago

verdverm

You're supposed to be writing compilers during that time

17 days ago

ShakataGaNai

That would be appreciated. I saw the homepage and my first thought was "Ansible is Python. How are these things different?" Obviously pure Python vs YAML is one thing. But beyond that it's not clear. Perhaps there are specific use cases in your mind where one or the other is a better fit; spelling those out would be helpful as well.

17 days ago

ransom1538

Stupid question. Ansible is mentioned TWO times in the HN hiring thread; it will soon be zero. Isn't this dated tech? Ansible is good at patching servers (they have cute names, make sure they are patched). Patching??? Why not use containers instead? Aren't we moving away from Ansible/Puppet? I would much rather have AWS CDK or something like pyinfra if I were into pet machines.

16 days ago

jethro_tell

Someone somewhere has to set up machines so you can do containers. That doesn't mean they are pets.

16 days ago

ransom1538

You mean terraform?

15 days ago

jethro_tell

No, before Terraform, someone has to boot a machine and slap an image on it or do an OS install, register the host in some way, and have it checking in for Terraform.

I use Ansible for creating machine images or initial provisioning. (I don't run the Ansible myself: someone racks the host, sets its build state to install, and boots it; the host joins the appropriate cluster and people do container things. I don't necessarily know when my Ansible runs against a host.)

I also have a pretty good stack of Ansible playbooks that I use manually day to day for hardware validation for new server models and one-off type stuff. But again, I never really know what I'm running against or have pet servers.

A good chunk of hardware validation runs automatically if the boot target is set to hw-validate, but the whole point is that you are gonna find stuff that doesn't work with your standard process and either pass on it or adjust.

I do run TF to provision cloud infra so it's transparent to the devs, and, honestly, I'm not sure how Ansible is dated and TF is not; they are pretty much the same thing in a different coat.

And honestly, generating thousands of lines of conflicting generic yaml isn't really much of an improvement over writing it once and running it automatically on 1000s of boxes.

13 days ago

OJFord

Is it declarative? Obviously Python isn't, but since it's not executed as a script but rather as a module passed to pyinfra, it could be; it looks like maybe it is just registering work to (potentially) do on module load?

If so, nice, shout about it more - it's my number one requirement of such a tool, why I think Terraform (or OpenTofu) is great and mostly everything else sucks, and I think it should be everyone's. It's just obviously (at least, once someone makes it available!) the correct paradigm for managing stateful resources and coping with drift.

18 days ago

Fizzadar

Yes... and no. It depends on the operation (the docs explicitly state if an operation is _not_ idempotent "stateless operation"). Operations are either:

- state definitions: "ensure this apt package is installed" (apt.packages: https://docs.pyinfra.com/en/next/operations/apt.html#operati...)

- stateless: "run these shell commands" (server.shell: https://docs.pyinfra.com/en/next/operations/server.html#oper...)

Most operations are state definitions, and those are much preferred; the stateless ones exist to satisfy edge cases where the stateful version either isn't implemented or simply isn't possible.
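
To make that concrete, a rough sketch of the two styles side by side (argument names per the linked docs, untested here):

  from pyinfra.operations import apt, server

  # stateful: declares "nginx should be installed", a no-op if it already is
  apt.packages(name="Ensure nginx is present", packages=["nginx"])

  # stateless: always runs the given shell command on every execution
  server.shell(name="Reload nginx by hand", commands=["systemctl reload nginx"])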

18 days ago

tetha

Ah, so this is similar to the Terraform CDK approach?

In Terraform CDK, you use a language like python to compute the set of resources and such you want to have, and then hand that over to the terraform core, which does the usual terraform song and dance to make it happen.

This is actually interesting to me, because we struggle so much with even the simplest data transformations in Ansible. Like, as soon as you start thinking about doing something like a simple Python list comprehension in Jinja templating, lots and lots of pain starts. From there, we're really starting to think about templating the inventories in some way because it would be less painful.

18 days ago

Fizzadar

Interesting, I hadn't heard of CDK before! Kind of similar? As long as the language is Python I suppose! It would be possible to integrate with other languages too I guess, but that's not something I've ever looked into.

Totally agree on templating which is why inventories have always been python code just as operations, giving maximum flexibility (with some complexity/type confusion drawbacks).

17 days ago

4m1rk

CDK just generates AWS CloudFormation files (there is a CDK for TF as well, as said before, but it's not the official AWS implementation). It's like having a few constructs that allow you to generate a YAML or JSON file.

16 days ago

F-Lexx

Relevant XKCD: https://xkcd.com/303/

17 days ago

peteradio

I never understood the desire to make things declarative. It seemed to me to always hide what is actually happening and it made it more difficult to understand. Is there a simple way to understand why declarative stuff is desirable to some people?

18 days ago

NortySpock

It lets you separate "here is the final state of the system that I want" from "how to get there".

If a SQL compiler or `terraform plan` command can convert "the current state of the system" + "desired end-state" to a series of steps that constitute "how to get there from here", then I can usually just move forward to declaring more desired states after that, or debugging something else, etc. Let the computer do the routine calculations.

When using a path-finding / route-finding tool, having the map and some basic pathfinding algorithms already programmed in means we no longer need to "pop a candidate route-segment off the list of candidates and evaluate the new route cost"... I simply observe that I am "probably here" and I wish to get to "there"; propose a route and if it's good enough I'll instruct the machine to do that.

If I can declare that I want the final system to contain only the folder "/stuff/config.yaml" with permissions 700 -- I don't care what the contents of stuff were previously, and if it had a million temp files in it from an install going sideways or the wrong permissions or a thousand nested folders in it, well, it would be great if the silly computer had a branching workflow that detected and fixed that for me, rather than me having to write yet another one-off script to clean up yet another silly mis-configured system that Bob left as a dumping ground that I have to write yet more brittle bizarre-situation-handling code for.

Same for SQL and data. "Look, Mr. Database, I don't actually know what's in the table today, and I don't know why the previous user dumped a million unrelated rows in the table.... Can you answer my query about if my package has shipped, or not?"

17 days ago

peteradio

It sounds like you need a shim between every single dependency to make its setup work declaratively. That sounds like a versioning nightmare to rely on, and that was borne out in my experience with it.

16 days ago

michaelmior

I think the main reason people like the declarative approach is that done right, it's idempotent. You also don't have to think about the current state of the system at all. You just need to describe what you want it to look like. Of course in practice it can be more nuanced than that, but thinking declaratively can make things much simpler in some scenarios.

18 days ago

Spivak

These aren't as related as you think. Ansible is imperative and idempotent.

17 days ago

bjt12345

Ansible is only as idempotent as the module it calls though.

17 days ago

windexh8er

Ansible's resources are declarative [0]. What part of Ansible is imperative?

https://docs.ansible.com/ansible/latest/reference_appendices...

17 days ago

matrss

Ansible playbooks are usually a list of steps to execute in order (imperative). Those steps may try to present a declarative interface to what they are supposed to do, but many fail to fulfill the definition of "declarative" that you have linked. E.g. with the built-in modules it is impossible to declare a desired set of installed packages, only a set of packages to be installed _in addition to all already installed packages_. This means it is impossible to remove an installed package again by removing it from the declaration, you have to specify a second step (imperative) that explicitly removes the package. This makes it impossible to declare a final state for "installed packages" with ansible.

17 days ago

XorNot

This is debating state-management though, which Ansible makes the correct choice about: Ansible largely works the way a user expects when they transition to it from doing things on the command line, and guides them towards idempotency (which is a pre-requisite for declarative configuration).

The problem is that to track deletions you either have to constantly have a view of global state (i.e. do you want to put `linux-kernel` in your package list?) or you need to store specific state about that machine (i.e. `redis` was installed by playbook redis-server.yml, task "install redis") - because the package's absence in that list doesn't necessarily mean "uninstall it" if something else in another playbook or task will later declare it should be present.

As soon as you're trying to do deletions, you're making assumptions that the view of the state you have is complete and total and that is usually not the case - and even if it is within the scope of your system, is it the case on the system you're interacting with? Do you know every package that should be installed because it comes out of the box in the distro? Do you want to (aka: do you have the time, resourcing and effort to do this for the almost zero gain it will get you in the short term unless you can point to business outcomes which are fulfilled by the activity?)

17 days ago

jacobr1

> As soon as you're trying to do deletions, you're making assumptions that the view of the state you have is complete and total and that is usually not the case

terraform does this, which is why it tracks its own representation of the prior global state. So when you remove a declared resource the diff against the prior state is interpreted as a delete. Note this does introduce the problem of "drift" when you have resources that are not captured in the scope of the state.

> i.e. do you want to put `linux-kernel` in your package list?

Yes. At least I want to put something like "core-packages" or "default" or similar as part of setting my explicit intent.

17 days ago

matrss

Yes, this is debating state management. For full declarativity some form of state management for the parts of the system that should be under declarative control (like terraform) or a stateless but very holistic view of the system (like NixOS, I guess also Guix System) are needed.

Given that ansible has neither, it can't be much better than what it is. I disagree that that is the right choice though. As it is I see not much more value in ansible than in some sort of SSH-over-xargs contraption combined with a list of servers. The guarantees they give are the same.

> Do you know every package that should be installed because it comes out of the box in the distro? Do you want to [...]?

No, I don't want to. Thankfully, with NixOS I don't need to, since the pre-installed packages are automatically part of the declared state of my NixOS systems (i.e. I declare the wanted state in the same way in which the defaults are also declared, which makes it easy to merge both).

17 days ago

gtirloni

> Ansible playbooks are usually a list of steps to execute in order (imperative)

You can't be declarative all the way down because reality is not declarative.

You can have all modules being declarative but if you need orchestration, it's not declarative anymore unless you create a new abstraction on top of it.

So people keep arguing about declarative vs imperative and fail to specify at which abstraction level they want things to be either.

17 days ago

matrss

I agree with you, your declarative abstraction has to have an imperative implementation underneath that will do all the dirty work. Ansible presents this declarative interface at the module level (if the module is implemented properly, most aren't), and a playbook is an imperative list of declarations to be applied. Roles also combine a list of imperative steps into a declarative interface.

Since apparently (I try to avoid ansible, so I might be missing something) playbooks are the go-to approach of using ansible this means that most uses of ansible are imperative (in the context of configuring a system), unless you only ever give a system a singular role and then you are probably defining your role in imperative steps.

A system like NixOS on the other hand presents the entirety of a system configuration in a single declarative interface that is applied in one go, while applying such a configuration to a system can be a thought of as an imperative step (although it is usually a singular, unconditional step). So it is declarative at a higher abstraction level.

17 days ago

michaelmior

I didn't intend to suggest that the declarative approach is the only way to achieve idempotency.

15 days ago

mmh0000

It’s the cattle not pets mindset. In most organizations the sysadmin team is really undersized. Not uncommon to have one admin per several hundred systems. In such places, there is no time to care for individual servers. If a server is misbehaving we blow it away and spin up a clean replacement.

Declarative scripts make it easy to manage a fleet.

17 days ago

OJFord

I think that's... perhaps not orthogonal, but it has some orthogonal component - you could certainly have something like:

  for i in range(100):
      ip = cidrhost(subnet, i)  # hypothetical helpers, for illustration
      if get_server(ip):  # a server already exists at this IP, skip it
          continue
      create_server(ip=ip)
and so on. I don't like it, but because it's procedural/imperative, not because it's particularly more 'petty' than the Terraform (or equivalent) would be.

For me it's more about what I'm doing, conceptually. I want a server to exist, it to have access to this S3 bucket, etc. - the logic of how to interface with the APIs to make that happen, to manage their lifecycle and check current state etc. isn't what I'm thinking about. (In Terraform terms, that belongs in the provider.) When I write the above I'm just thinking I want 100 servers, so:

  resource "cloud_server" "my_servers" {
    count = 100

    ip = cidrhost(subnet, count.index)
    # and so on
  }
comes much more naturally.

17 days ago

linuxdude314

This fails completely even at small scale when the script is interrupted before finishing.

The difference between just using some Python vs Terraform is idempotency. TF isn't going to touch the nodes the script succeeded on; if you have to restart your for-loop script it will, which may not be desirable.

Frankly these days configuration management is a bit dated…

You’re much better off in most cases using a tool like Packer with whatever system you want to bake an image, then use a simple user-data script for customization.

It’s very hard to scale continuous config management to thousands of servers.

17 days ago

OJFord

Eh, any way you do it could leave it in an unfinished state if interrupted, I'm not too bothered about that. (But it does sound like you think I was speaking in favour of doing it in a procedural python script sort of way? I was not.)

Packer and Terraform do different jobs (they're both by Hashicorp!) - you can bake an immutable image all you like, you still need to get a server, put the image on it, give it that S3 bucket it needs, IAM, etc.

17 days ago

jacobr1

They work together to produce immutable cattle. The alternative is managing a pool of servers where you are doing things like in-place patch upgrades, vs a teardown of the old infra and replacing it with the newly baked servers.

17 days ago

OJFord

I'm well aware, I just don't see what 'use Packer' has to do with choice of programming paradigm for Terraform or other tool in that role.

17 days ago

hathawsh

Have you been introduced to functional programming? It's excellent and mind-bending at first. Here's an overview: https://github.com/readme/guides/functional-programming-basi...

Declarative structure is at the heart of functional programming. Declarative is not the right choice everywhere, but when it makes sense, it can significantly raise the quality of the code.

17 days ago

emmelaich

and yet declarative can be somewhat unhelpful in the face of a mutable filesystem

16 days ago

Fizzadar

Conceptually I think it's much nicer to define the state of the system rather than the steps to get there, and let the tool of choice figure it out.

But there are always edge cases and situations where that doesn't work, which is why pyinfra supports both and they can be combined any way you like.

17 days ago

jacobr1

Because what most people want is actually something closer to "Goal Seeking." If the system works as intended (and as you point out with the need to debug, it often does not!) then defining the desired end-state and letting the system figure out how to get there is a simpler, higher-order abstraction. And it can also often be clearer to just say "ensure these prerequisites are met" such that alternative implementations can achieve the same outcome. In practice, abstractions are leaky.

17 days ago

dboreham

It makes people feel that they're smarter than you. See also functional programming. That said, sometimes it's useful as a way to auto-generate imperative actions.

17 days ago

OJFord

I really can't see how you could feel that way about it after spending even just a few minutes (which it sounds like you have) to understand what it means beyond just reacting to terminology, something having a name.

I definitely think it'd be easier to explain a python-like declarative language to someone who asks what programming is than actual python. 'It's just describing the way things should be' vs. 'it's like a series of instructions for how to compute ...'

Certainly not more clever IMO, if anything the opposite. Like I said above or elsewhere in this thread, when I'm managing infrastructure with Terraform I don't want to (and don't have to) be thinking about how to interface with the API, check whether things exist already, their current state, how to move from that to what I want, etc. I just know the way I want things to be, I declare that, and the procedure for figuring it out and making it so is the provider's job. That's not smarter! The smarts are in the provider! (But ok, if you're going to make me flex, I've written and contributed to providers too... But that's Go; not declarative.)

17 days ago

pid-1

> Obviously python isn't

Obviously you can create declarative idioms in Python

17 days ago

OJFord

Sure, if you keep reading I described one that it looked like this might be doing.

17 days ago

mrled

Oh man this is really cool. I have also written a Python infrastructure-as-code project (https://pages.micahrl.com/progfiguration/), I really like the idea of using a programming language rather than a text document to define infrastructure. Yours looks very polished, and the built in support for testing in Docker is a brilliant idea.

17 days ago

InitEnabler

dang, this exploded. I came across the project this morning when I was looking at a blog post about turning a general-purpose programming language into a configuration language, and it mentioned pyinfra. Glad this project is getting some exposure. :)

17 days ago

orochimaaru

What’s the difference between pyinfra and fabric? Fabric seems to have overlaps especially for agentless execution.

16 days ago

rbut

How does it compare to Fabric? At first glance it looks quite similar. All our scripts are written in Fabric, but Fabric appears to be somewhat abandoned and the latest version never reached full parity with v1. I'd be looking to try something new next time.

17 days ago

esafak

How does it compare with pulumi?

17 days ago

mdaniel

Almost exactly as it compares to terraform, since both TF and Pulumi only get down into the shell of any provisioned virtual machine via "connect and run some shell, good luck". I'd guess it would also be horrifically painful to even do that in circumstances such as Auto Scaling Groups, where even TF and Pulumi don't know the actual IPs or InstanceIds.

The way TF and Pulumi traditionally think about this problem would be to use cloud-init/ignition/CloudFormation hooks to cause the machine to execute scripts upon itself. Ansible also has an approach to do that via "ansible-pull", which one would use in a circumstance where the machine has no sshd nor SSM agent on it but you still want some complex configuration management applied post-boot (or, actually, even if they do have sshd/ssm but there are literally a hundred of them, since the machines doing the same operation to themselves is going to be much less error prone than trying to connect to each one of them and executing the same operations, regardless of the concurrency of any such CM tool).

17 days ago

gnosek

[yet another reference to Ansible, sorry! :)]

This looks infinitely better than Ansible in some cases and somewhat worse in others (python.call every time I'd need to access a previous operation's result feels clunky, though I certainly understand why it works that way).

Do you think it would be possible to use Ansible modules as pyinfra operations? As in, for example:

  - name: install foo
    apt:
      pkg: foo
      state: present
could be available as:

  from pyinfra import ansible

  ansible(name='install foo').apt(pkg='foo', state='present')

where the `ansible` function itself would know nothing about apt, just forward everything to the Ansible module.

Note 1: I know pyinfra has a way to interface with apt, this is just an example :)

Note 2: It's just my curiosity, my sysadmin days are long gone now.

17 days ago

Fizzadar

Definitely possible! I'm not familiar with the Ansible Python API so I'm partially guessing, but the pyinfra op could yield a callback function that then calls Ansible at execution time.

Alternatively you could just yield the ansible CLI command and execute it from the local machine using the @local connector.

17 days ago

mdaniel

FWIW, ansible modules (all of them, to the best of my knowledge) operate via a stdin/stdout contract since that's the one universal api for "do this thing over (ssh|docker|ssm|local)". That's also why it supports writing plugins in any language (shell, compiled, python, etc) since `subprocess.Popen().communicate(b'{"do_awesome":true}')` works great

DISCOVERING the available ansible actions is the painful part since, like all good things Python, it depends on what's currently on the PYTHONPATH, which makes writing or using any such language server a bit of a nightmare.

And this wasn't what you asked, but ansible has a dedicated library for exec, since the normal `ansible` and `ansible-playbook` CLIs are really, really oriented toward interactive use: https://github.com/ansible/ansible-runner#readme

17 days ago

linsomniac

What is different in v3? Didn't see it in the "Next" docs.

17 days ago

js2

Mostly this from the 3.x changelog:

> pyinfra now executes operations at runtime, rather than pre-generating commands. Although the change isn't noticeable this fixes an entire class of bugs and confusion. See the limitations section in the v2 docs. All of those issues are now a thing of the past.

https://github.com/pyinfra-dev/pyinfra/blob/3.x/CHANGELOG.md

17 days ago

activatedgeek

I currently use Ansible to set up both local and remote hosts. I've been very happy with it, and love that Pyinfra intends to support the Ansible connector.

My main gripe with Ansible is the YAML specification. Ansible chooses to separate the task specification and task execution. Pyinfra chooses to directly expose the Python layer, instead of using slightly ugly magic functions/variables. I like this approach more since it allows standard Pythonic control flow instead of using a new (arguably ugly and more hassle to maintain) grammar.

Excited for Pyinfra!

18 days ago

WesolyKubeczek

I'm only using Ansible because of its extensive documentation and mindshare, but my best successes with it were when I let go of the idea that the playbooks specify state "declaratively". I now treat them as imperative steps where each step is being checked as to whether it needs to be done or not, and it has vastly simplified my mental model of what Ansible is actually doing.

18 days ago

letmeinhere

I think of ansible as a declarative-imperative lasagna, where each playbook is a desired state, achieved by an imperative sequence of plays, which themselves are desired states, achieved by a sequence of roles, which have the same properties, and then tasks too below that, finally resolving to plain old imperative Python.

It's all pretty messy but useful.

18 days ago

WesolyKubeczek

I never grokked this "plays" and "roles" business. All in all, this clever and cute terminology gives me the creeps. I only use "playbooks" as series of tasks, more or less.

Maybe I need an explanation "like I'm just a programmer/sysadmin and I need boring, years-old terms" of what is what; every explanation so far (when I bothered to look for it last) was too invested in this theatrical terminology, so I gave up and stuck to what worked after a command or two.

Same with Chef and its galore of cooking words, but thankfully I don’t have to use Chef.

18 days ago

zanecodes

To this day I'm miffed that Chef has "cookbooks" which contain "recipes," which contain... "resources." Why not "ingredients??" It was right there!

17 days ago

jimkoen

> Maybe I need an explanation “like I’m just a programmer/sysadmin and I need to use boring terms years old”

The issue is, Ansible was written for sysadmins who aren't programmers. There is no good explanation, other than it's a historically grown, syntactic and semantic mess that should've been barebones python from the get go.

It is not idempotent. For example, how can I revert a task/play when it fails, so that my infra doesn't end up in an unknown state? How do I deal with inevitable side effects that come from setting up infra?

People will now refer you to Terraform but that is imo a cop out from tool developers that would much rather sell you expensive solutions to get error handling in Ansible (namely RedHat's Ansible Automation platform) than have it be part of a language.

But to give you a proper explanation: Plays define arrays of tasks, tasks being calls to small python code snippets called modules (such as ansible.builtin.file or ansible.builtin.copy). To facilitate reuse, common "flows" (beware, flow is not part of their terminology) of tasks are encapsulated into reusable roles, although reusability depends on the skill of the author of the role.

17 days ago

ornornor

Ansible is useful but so confusing (to me anyway).

The way I see roles vs playbooks is whether I'm going to reuse it or not.

Roles are more generic playbooks, in a sense, that I can share with others or across deployments (for example, set up a reverse proxy, or install a piece of software with sane, overridable defaults).

I can then use roles within playbooks to tweak the piece of software's configuration. If it's a one-off config/setup then I'll use a playbook.

I don’t know if it’s the right paradigm (I don’t think it’s explained well and clearly in the docs), but using this rule of thumb has helped me deal with it.

Of course, any role can be a playbook and vice versa since they do the same thing functionally, it’s all about reusability and sharing.

Kinda how you have libraries in software: role = library, playbook = the software you actually want to write.

17 days ago

rmetzler

An Ansible playbook is usually the main entrypoint; it consists of a list of plays. Each play has hosts and a list of tasks to run on them.

Because people wanted to reuse their code, the abstraction of roles was created. A role is something like „setup basic OS stuff“, „create this list of users“, „setup a Webserver“, „setup a database“. The association, which roles to run on which machine still happens in the playbook.

18 days ago

WesolyKubeczek

I'm using include_tasks: and import_playbook:, like an animal :)

17 days ago

trallnag

You can't share a set of tasks on Ansible Galaxy without wrapping it in a role

17 days ago

polski-g

My biggest problem with Ansible is the YAML: doing anything with loops is horrendous, and trying to wrangle nested variable types requires a Stack Overflow post every time.

A few years ago, I found a library that lets you utilize Ansible's tasks in raw Python, without the huge hassle of using the Ansible Python API. I cannot find it again however. But PyInfra looks great.

18 days ago

Fizzadar

This alone is the entire reason I started working on pyinfra; loops in YAML are just evil.
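
For comparison, a loop in a pyinfra deploy is just a Python loop. A rough sketch (the user list and arguments here are made up for illustration):

  from pyinfra.operations import server

  # declare one operation per user, using ordinary Python iteration
  for username in ["alice", "bob", "carol"]:
      server.user(
          name=f"Ensure user {username} exists",
          user=username,
          shell="/bin/bash",
      )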

18 days ago

bruh2

Why did you choose to roll your own modules rather than do what's described in the comment you replied to, i.e. provide a Python layer for interacting with the rich set of available Ansible modules?

Not trying to be rude ofc, I'm sure you considered it and have a good reason; just curious as to what it is. An incredible project you've put together, nonetheless :)

18 days ago

Fizzadar

Not rude at all :) When I first started (not sure if this is still the case?) Ansible would push Python code to the target machine and execute it there, meaning it wasn't actually agentless. I always thought of pyinfra as copying what a human would do if configuring a server by hand over SSH, so new modules that use only shell commands were needed.

17 days ago

natebc

I recall the Ansible Python API being labeled as Internal Use Only and subject to change on a whim because of that. That at least discouraged using Ansible in that way.

Seems they still kinda discourage it but do have examples at least.

https://docs.ansible.com/ansible/latest/dev_guide/developing...

17 days ago

geerlingguy

It could be interesting if you could write a translator to use any Ansible module with this, and vice versa.

17 days ago

movedx

But you can just write a small module in Python, have it do the looping logic for you, install it at the root of your project's configuration-as-code repository, and then use the module in the YAML, removing the need to do complex, ugly loops in YAML.
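
A minimal sketch of such a module (the module name and its arguments are hypothetical; AnsibleModule is the real entry point, but double-check the details against the Ansible docs):

  #!/usr/bin/python
  from ansible.module_utils.basic import AnsibleModule

  def main():
      # accept a list argument and do the looping/reshaping in Python, not YAML
      module = AnsibleModule(argument_spec=dict(items=dict(type="list", required=True)))
      results = [item.upper() for item in module.params["items"]]
      module.exit_json(changed=False, results=results)

  if __name__ == "__main__":
      main()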

Is there a reason this isn't an option for you?

17 days ago

nijave

Real Python instead of templating (Jinja in YAML) would be nice.

In Ansible, it's fairly arduous to try to reshape data from command outputs into structures that can be used in loops in other tasks, especially if you want to merge output from multiple commands. The main use case is more dynamic playbooks where you combine state from multiple systems to create a new piece of infrastructure.

I think templating YAML, or templates inside YAML, is a bit of an anti-pattern.

17 days ago

dang

Related:

Pyinfra automates infrastructure super fast at scale - https://news.ycombinator.com/item?id=33286972 - Oct 2022 (37 comments)

Show HN: pyinfra v2 - https://news.ycombinator.com/item?id=30999030 - April 2022 (2 comments)

Pyinfra v2.0 Released - https://news.ycombinator.com/item?id=30973976 - April 2022 (3 comments)

Show HN: Pyinfra v1.4 - https://news.ycombinator.com/item?id=26983266 - April 2021 (3 comments)

Pyinfra – automate infrastructure super fast at scale - https://news.ycombinator.com/item?id=23487178 - June 2020 (64 comments)

Pyinfra v0.3 - https://news.ycombinator.com/item?id=13862942 - March 2017 (1 comment)

Pyinfra v0.2 - https://news.ycombinator.com/item?id=12956784 - Nov 2016 (2 comments)

17 days ago

jdoss

I just started using Pyinfra to wrangle a bunch of servers and it is a breath of fresh air compared to Ansible. I moved all of my server OS installs to Fedora CoreOS which doesn't ship with Python in the OS and since Pyinfra doesn't need Python on the host node I can kick off tasks in bulk to do server things. It is great. I cannot wait to see where the Pyinfra project goes.

On a side note, one of the most hacky things I came up with to get Ansible working on Fedora CoreOS was to bind mount a container rootfs that had python 3 and then symlink it into the right spots. You can of course add Python in with rpm-ostree if you want but I wanted to avoid layering packages at the time. I wasn't proud of it. But it worked.

https://github.com/forem/selfhost/blob/main/playbooks/templa...

18 days ago

shoggouth

Doesn’t IBM/Red Hat own Ansible and Fedora CoreOS? I would think they would mix together perfectly.

17 days ago

movedx

> since Pyinfra doesn't need Python on the host node I can kick off tasks in bulk to do server things.

And you can do this with Ansible, too. Check out the raw module/command.

17 days ago

jdoss

I am aware of the raw module. The stuff I was doing with Ansible and Fedora CoreOS required more than just that module.

17 days ago

movedx

Couldn't you use the raw module to get Python into place and then use the rest of Ansible's feature set after that?

16 days ago

zbentley

I think Puppet hits the sweet spot in this area. Its default is a series of idempotent "here's how this should be configured" statements, but it can be used as a full programming language in its more advanced capacity, and it's reasonably extensible (in Puppet-lang and Ruby) to support specific custom applications.

I also think that the facts/manifest/apply separation is conducive to nicely testable infra code, and useful dry-run output.

I'm always surprised that Puppet isn't still more popular. My theory is that it's passed over because of its age/cruftiness/bad vibes in some cases, and that a couple of technical flaws mess it up for some key userbases:

For folks who just want a quick-to-start management tool for a small set of config, Puppet's ugly and clunky client/server model and the hyper-YAML-ification of its best practices (which is pursued to a fault by the community, and not helped by the Hiera pitch that the Puppet stack can also be sort of an asset tracking/catalog system) make small-scale usage and prototyping hard. Puppet doesn't have to be used that way (it can be used just like pyinfra/Ansible with a local-apply or via Bolt, hitting a nice sweet spot between ad-hoc/non-idempotent commands and nice declarative/idempotent Puppet code), but I think the puppetmaster/hiera-all-the-things legacy in the community does Puppet and potential new users a disservice.

From the other side, I think a lot of more cloud-oriented users looking for a "better Terraform for server state" end up annoyed by the quality of modules on the Puppet forge and Puppet's lack of a statefile equivalent (meaning that it doesn't support deletes or infrastructure state snapshots in the same way TF does).

18 days ago

turtlebits

Adding config management agents to run on your infra is IMO unnecessary operational burden. (ie puppet, chef, saltstack, etc.) In the day and age of everything running on Docker, the closer you are to a bare OS image, the better.

Config management that uses SSH is generally good enough.

18 days ago

zbentley

I agree; that's the "client-server legacy" that I mentioned in GP.

It's unfortunately not widely known that Puppet can be run just like you describe, over SSH (or, for e.g. running in a Docker container, can be invoked as a one-shot "puppet apply" against a local configuration file like pyinfra's "local" transport): https://www.puppet.com/docs/bolt/latest/bolt.html. Doing that requires no background daemons, puppetmasters, cert-signing hell, inventory management PuppetDB/Foreman stacks, or any of that stuff: you run a command which SSHes to a remote/local machine and applies changes based on instructions written in Puppet-lang or one-off scripts. The remote end is entirely self-hosting; it doesn't rely on anything being running on the remote host (Bolt will install the "puppet-agent" package to bootstrap itself, but in this context that package is inert and is used equivalently to a library when you run tasks).

I'm with you that the agent-based approach is far from the best way to go these days. I'm just bummed that we're throwing the baby out with the bathwater: I wish Puppet-the-language and Puppet-the-server-management-tool weren't so often dismissed along with the Puppet-as-inventory-system or Puppet-as-daemonized-continuous-compliance-engine.

18 days ago

bigstrat2003

Hard disagree. Having an agent running on things is IMO far superior for preventing config drift (agents checking in versus one big centralized cron job pushing state to everything). And to be honest, the fact that it doesn't play as well with Docker is a flaw with the idea of putting everything in Docker, not having a config agent. Some things work well in containers, but it's silly to try to shoehorn everything into them the way many people do.

18 days ago

eurekin

I can concur; I used Puppet a bit at the day job and agent issues were common at some point.

Also, for bigger inventories on a single VM, runtimes quickly shot up into the hour range.

18 days ago

zbentley

Yeah, dealing with agent issues sucked; I'm glad I haven't worked on one of those setups in awhile. And if the agent bootstrapped some part of the shell-in-and-remotely-troubleshoot tooling, good luck debugging it, and if the agent bootstrapped the telemetry system, good luck telling the difference between "host with agent failure" and "host that disappeared"... anyway. Fun times.

For hour+ runtimes I really do think that's pretty much always user error. I know that's a clichéd and grouchy comment, but (as, I'll admit, a Puppet fan with some personal defensiveness for a favored tool) I do think it's true in this case.

18 days ago

esoterae

SSH and its child processes are just another agent: an agent of a model that must be up at time-of-convergence as seen from the coordinator node; a remarkably inflexible arrangement that can only be addressed with additional development not otherwise necessary.

Ruby is far, far preferable to shell for ease of idempotence and implicit convergence.

18 days ago

aprdm

I believe it's due to Ruby being its language of choice. Ruby is mostly a dead language in the Ops space, unlike Python.

Having inherited a big Puppet mess from some people who used the flexibility of Ruby to automate 5 datacenters, but then left the company, was also an interesting experience...

17 days ago

slyall

Another reason for puppet being less popular is lots of places ended up with very complicated configuration that did everything on the server but was hard to work with.

With Ansible you could deploy a small playbook that did just one thing. A lot easier to get started with and keep under control.

As others have mentioned, Puppet was also a lot less useful once server images could be pre-configured and were often short-lived. It was more designed to take a bare OS install and turn it into a long-lived server.

17 days ago

bityard

I was an early adopter of Puppet back when it was fairly new. It was a breath of fresh air when the state of the art was cfengine!

Despite its many great ideas, I never particularly liked the agent or need for a master server. And I've always managed to avoid learning Ruby so I couldn't easily hack on it myself. The company I'm with now uses it extensively so I'm having to re-learn it and so far my impression is that it went from "cool new open source thing" to "your average enterprise-grade bloatware thing".

17 days ago

nijave

Usually when I use these types of tools I'm building immutable infrastructure where a golden image gets built and an existing app data volume gets attached to a new OS image (same workflow as Docker containers but more access to kernel/hardware)

Puppet doesn't work well for that. I've seen it come up in auditing scenarios since the agent can effectively report if the instance is still in the correct config state.

17 days ago

pants2

This will tie nicely into my favorite way to deploy services these days:

1. Use PyInfra to set up Docker and Tailscale on remote hosts and any other setup. Open the Docker port to your Tailnet.

2. Use the Docker provider for Terraform to set up and manage containers on those hosts from your dev machine or from a CI/CD tool. Tailscale allows containers on different machines to communicate privately, or you can open a port to the web.

This makes for such an easy-to-use and bulletproof setup. In the past I would have used Kubernetes but I've come to realize that's overkill for anything I do and way harder to debug.
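
For step 1, a rough pyinfra sketch might look like this (the package name and the Tailscale install method are assumptions; adjust for your distro):

  from pyinfra.operations import apt, server, systemd

  # install Docker from the distro repos (Debian/Ubuntu package name assumed)
  apt.packages(name="Install Docker", packages=["docker.io"], update=True)
  # install Tailscale via the upstream convenience script
  server.shell(
      name="Install Tailscale",
      commands=["curl -fsSL https://tailscale.com/install.sh | sh"],
  )
  systemd.service(name="Enable and start Docker", service="docker", running=True, enabled=True)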

17 days ago

nijave

This kind of setup is a nice improvement over golden images with a lot of the benefits. Application setup, upgrades, and rollback become much easier when the whole app is packaged together and has its own copy of dependencies.

You can also throw in systemd units for Docker or Podman. I usually create a small shell script that pulls, removes any old container, then runs a new container with correct args in the foreground and toss that in a simple systemd unit

17 days ago

asselinpaul

is there a blog post or github repo with more info on how you do this?

17 days ago

pants2

No but I'll think about writing one up!

17 days ago

udev4096

Why not go for headscale?

17 days ago

photonthug

Agree with those saying the landing page needs work. But terraform/docker integration sounds interesting... after many years of Ansible you'd think there would be a more comfortable way to replace a hundred lines of hacky bash in Dockerfiles.

Also, can I just say that cm is extremely frustrating? Not sure this is the fix, but hopefully the story isn’t over. In my experience the maintenance of cm codebases never, ever stops. At first I thought it was a matter of expertise, but experts typically agree and just call it the cost of doing business.

Shelve something for three months and it will break on the next run, on the same os/host where it used to work. Blame the package manager, blame the os choice, or the cm tool. But it’s embarrassing and insulting for Devops teams after putting in the effort to do things right, and evangelizing to everyone else about repeatability. I’d rather just see tighter integrations with containers moving forward and never think about it again. Not everyone is using k8s but in the 2020s everyone probably should default to using docker before doing things of even marginal complexity directly on hosts.

18 days ago

Fizzadar

> Agree with those saying the landing page needs work.

Any & all feedback much appreciated! It's basically just a very rough copy of the README at the moment.

18 days ago

kureikain

Thanks for making Pyinfra.

It's one of those tools that gets out of your way and lets you get the work done. The tool works for you instead of you fighting with the tool.

Pain points of Ansible, like storing state and checking it later or coordinating state between servers, are all a breeze with Pyinfra because you write the Python code to perform those checks.

The system is very well thought out. No need to hack around a hosts file; the inventory is just a Python script that exports resource definitions.

No more static, ad-hoc host vars; you get a real Python script to define and return your variables.

Using pyinfra I was able to focus more on the "compute"; the state such as credentials and inventory can be managed and stored elsewhere, such as in SSM, or you can just call the Python EC2 API to filter instances by tag.
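
For example, a pyinfra inventory is just a Python file; a rough sketch (the group names and the boto3 filtering are made up for illustration):

  # inventory.py
  import boto3

  # a static group: the variable name becomes the group name
  web_servers = ["web-1.example.com", "web-2.example.com"]

  # a dynamic group: pull hosts by tag straight from the EC2 API
  ec2 = boto3.client("ec2")
  reservations = ec2.describe_instances(
      Filters=[{"Name": "tag:role", "Values": ["app"]}]
  )["Reservations"]
  app_servers = [
      instance["PublicDnsName"]
      for reservation in reservations
      for instance in reservation["Instances"]
  ]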

17 days ago

wg0

Ansible needs a working Python interpreter on the target machine.

Pyinfra doesn't even need that. It just needs a shell.

Subjective opinion, but it is a heavily under-recognized piece of software.

Ansible is really great but you soon end up writing Python in YAML strings.

So why not straight up Python?

17 days ago

mdaniel

As an FYI, "needs" is not correct, it has `raw:` for doing anything the target interpreter understands (sh, bash, powershell, etc), which can then include actually provisioning beefier interpreters (full blown python, pypy, whatever)

Ansible plugins can be written in any language, shell, compiled binaries, whatever, and communicate with the control plane via stdin/stdout

I suspect you are thinking of Jinja2 when you are writing python in yaml strings, which ... kind of, I guess, but also confusingly not Python, or at least the hacked up copy of Jinja2 that ansible uses can't do all the fun things normal Jinja2 can

17 days ago

Lucasoato

Should this be considered some kind of alternative to tools like Ansible?

Also CDKTF should be in the space for imperative infrastructure as code definitions.

- https://developer.hashicorp.com/terraform/cdktf

18 days ago

slt2021

cdktf is fantastic

18 days ago

Lucasoato

What languages are you using it with? Last time I tried with Python the code was super verbose, and type hinting suggestions were not happening in both VS Code and PyCharm... could that be linked to the fact it's transpiled from TypeScript?

17 days ago

rajaravivarma_r

This is great. We tried ansible and gave up as it was difficult to keep configuration DRY and annoying to create conditions with no control structure.

It was before ansible 2, so probably things are better now.

Then we started using Python fabric. Wow it was so freeing. Any helper methods were easily extracted and writing conditions felt natural.

Now I am using Python invoke to maintain my local setup.

17 days ago

bityard

I gave up being religiously DRY in Ansible playbooks early on. It's much easier to open a file and read through a list of simple 2- or 3-line tasks that execute sequentially, than it is to chase down a bunch of imports.

Same as in programming, over-adherence to DRY leads to spaghetti code.

17 days ago

rajaravivarma_r

I have tried this path of not trying to DRY everything, but I regretted it and eventually refactored to a more DRY approach. The cost of remembering to fix/alter the logic everywhere is more than the cost of keeping it DRY. More often than not a method or a module is enough, nothing fancy.

The only place where I have accepted that DRY is not worth it is unit tests. I used to extract any common behavior into a shared test, but each object eventually evolves in its own way such that the effort to make it DRY becomes useless.

17 days ago

Izmaki

Ansible is strong when done right. Check out the tutorial series by Jeff Geerling on YouTube, he's amazing.

17 days ago

rajaravivarma_r

Maybe, but moving to Python did not take anything away. It brought more joy in that you have more control over things like which server to run the migration on, choosing UAT or prod, or just a list of servers specified on the command line.

And organizing the modules was straightforward as we already knew/did that in the project.

Perhaps it comes from my programming background, but it's true.

17 days ago

mhh__

I worry about using python for this kind of thing.

It's very hard to be confident about python code.

If you have a good code review feedback loop and so on then it can be OK but proper types enable lots of good things when dealing with configuration and state.

17 days ago

Spivak

I mean Python has your back with static type hints. While Python's type system isn't the most powerful in terms of expressiveness (TypeScript is stronger, Go is weaker), it's more than capable enough for a config management system.
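
A tiny sketch of what I mean (the names are made up; the constraint is enforced by a static checker like mypy, not at runtime):

  from typing import Literal, TypedDict

  class PackageSpec(TypedDict):
      name: str
      state: Literal["present", "absent"]

  def apply_package(spec: PackageSpec) -> None:
      print(f"{spec['name']} -> {spec['state']}")

  apply_package({"name": "nginx", "state": "present"})  # OK
  # apply_package({"name": "nginx", "state": "installed"})  # rejected by mypy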

17 days ago

mhh__

Emphasis on hints.

And my point is that it can be way too capable.

17 days ago

Spivak

I guess the fact that they're hints doesn't really bother me when you're doing static analysis. You can have strong typing with a weak type system like C and Go where the types will be rigidly enforced but they're also not expressive. There end up being lots of things you can't express in the type system which leads you to do things like void* or `any` with manual casting.

But a fully type-hinted Python codebase is extremely expressive, the times where you have to opt-out of the type system is much much rarer and the types you end up writing are much more specific so you get stronger guarantees. It's not without downsides but I don't think it's "because they're hints you can't trust them" since lots of languages erase their types on compilation.

17 days ago

exceptione

I am not elbow-deep in the Python ecosystem, but how many Python codebases are fully type-hinted?

Maybe I am overlooking because I am not a pythonista, but when looking at this code [1] I see only some superficial hints. Looking at `_make_command`, I need to look inside the body to see that the first argument is expected (?) to be callable (it just ignores otherwise).

____

1. https://github.com/pyinfra-dev/pyinfra/blob/3.x/pyinfra/api/...

17 days ago

belthesar

It's definitely still in the minority, but you're seeing a _lot_ more newer projects adopting consistent, ubiquitous type hints.

To your point though, the `_make_command` method here is not setting hints on its arguments. I'm not sure whether this is considered "fine" in a pydantic world; for my usage, I found native Python type hints were more than fine to make my code more usable and safer. Based on the code though, it seems like there are cases where the `command_attribute` is not a callable. What I don't understand is why this isn't hinted as a Union of Callable and whatever other types it could receive. I'd have to spend more than 3 minutes looking at the code base to understand how it's used to get a stronger idea here.

16 days ago

mattbillenstein

Was there any thought to perhaps do a version with an agent? I really like how fast Saltstack can be as compared to Ansible.

I've been using my own homegrown project that does just this - Python roles, server/client, Mako templates: https://github.com/mattbillenstein/salty

It's very very fast to do deploys on long-lived infrastructure, but it hasn't been optimized for large clusters yet; I expect the server process will be a bottleneck with many clients, but still probably faster than Ansible for most setups.

17 days ago

Fizzadar

pyinfra supports executing on the local machine (@local connector: https://docs.pyinfra.com/en/2.x/connectors/local.html). If you store the operation files on the machine, that's basically an agent when executed, just without a periodic check for other changes. Adding a mode to do that in a loop would be pretty trivial...

17 days ago

mattbillenstein

Yeah, I'm talking more about RPC: the server sends a command to the agent, the agent does a thing and returns a response. There's no external sync of the command, and given a long-lived connection (client/server, what you will) this can all be completed in milliseconds with no new-connection overhead.

17 days ago

mleonhard

Very cool. One question: Can Pyinfra create container-like objects and objects inside them? Example: create an RDS database, create a user inside that database, and assign the user a role.

Terraform cannot deploy such a configuration in a single config, since its planning stage requires that all containers already exist. Terraform crashes when planning the user and role changes, saying that the database doesn't exist. This is a large pain-point when using Terraform. How does Pyinfra handle such deployments?

16 days ago

posix_monad

Python seems like a really poor choice for infrastructure.

- Python is not easy to build into portable binaries

- The package ecosystem is very hard to use in a reproducible way

- The language is not truly typed - types add massive value for infrastructure and scripts because they are less likely to be unit-tested

- The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)

Maybe I'm missing something? I don't know why I would want to introduce Python in this domain.

18 days ago

DandyDev

Why is it important to be able to build into portable binaries? Pyinfra doesn't require running Python on the machines you manage. Pyinfra basically turns your Python code into shell commands which it runs over SSH. So only your development machine has to run Python.

I think there is not a lot of overlap between people who need to automate infrastructure and people who don't know how to install Python on their development machine.

As for your other comments regarding Python as a language: I mostly agree. I have stepped away from Python as a language to develop production software. In Python I miss the confidence I get from static typing. Having said that, for automating infrastructure, you're effectively comparing Pyinfra and Python to bash scripts and YAML (for things like Ansible), which are both orders of magnitude worse if you like static typing or any form of being able to verify what you wrote.

17 days ago

yjftsjthsd-h

N=1: I'm capable of handling Python on my dev/deploy box(es), but that doesn't mean it's not a pain. In my perfect world, ansible/puppet/chef/whatever would ship as a single static binary even when they mostly run against remote SSH targets.

17 days ago

BirAdam

Right, but this makes me wonder why I can’t just do: ssh user@host “echo ‘Hello World’”

All these kinds of tools are essentially just executing commands over SSH… I could just use SSH.

17 days ago

fire_lake

> So only your development machine has to run Python.

If you have a team of developers and a CI process, then portability is important. There isn’t one development machine.

17 days ago

Fizzadar

Extremely aware of this (see the pyinstaller attempt: https://github.com/pyinfra-dev/pyinfra/pull/768)

I chose Python because it’s what I was writing all day back in 2015. Which makes me realise pyinfra is almost 10!

Edit: I mostly write Go or YAML (k8s) these days but Python still makes an appearance from time to time (outside of pyinfra dev).

17 days ago

goodpoint

Python is an excellent choice.

> - Python is not easy to build into portable binaries
> - The package ecosystem is very hard to use in a reproducible way

People have been using OS packages for four decades.

> - The language is not truly typed

The language IS strongly typed.

> - types add massive value for infrastructure and scripts because they are less likely to be unit-tested

99% of errors in deployment are not solved by typing.

> - The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)

If your logic is so complex that let/var makes a difference, you should not be touching infra.

17 days ago

kortex

This might have been true a few years ago, but these are all solved problems in 2024.

> Python is not easy to build into portable binaries

https://pex.readthedocs.io/en/v2.1.40/buildingpex.html

> The package ecosystem is very hard to use in a reproducible way

pip, virtualenv, and requirements.in/txt are extremely reproducible. I will grant that it's not exactly idiot-proof yet and there are tons of stale tutorials out there.

> The language is not truly typed - types add massive value for infrastructure and scripts because they are less likely to be unit-tested

Yes it is, if you want it to be. There's nothing stopping someone from using mypy, pyright, or another type checker on its strictest setting, and failing builds unless you have 100% type coverage.

> The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)

No, but you get ~95% of the safety guarantees by using immutable-esque objects like @dataclass(frozen=True), pydantic models with the same, or attrs/cattrs with similar settings.
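
For example (a minimal sketch with made-up names):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class HostConfig:
        hostname: str
        port: int = 22

    cfg = HostConfig(hostname="web-1")
    # cfg.port = 2222  # raises dataclasses.FrozenInstanceError instead of silently mutating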

16 days ago

mixmastamyk

Sounds like you've misjudged the use case. Tools like this do the deployment; they aren't generally deployed themselves.

So a portable binary is not a requirement. Other points like let or types are not an impediment either: there are many quality tools available if you need them (ruff, pyflakes, mypy), and Python has been doing this kind of work productively for thirty years now.

17 days ago

posix_monad

> Tools like this do the deployment, they aren't generally deployed themselves.

It will have to be executed on many different developer machines (or even your own machine several years in the future) so a simple, reproducible build process, including fetching pip dependencies, is critical.

17 days ago

mixmastamyk

Presumably the tool keeps backward compatibility, as Ansible and Salt did, so this doesn't seem to be a real-world concern. Very few folks are doing Nix-level stuff, yet the world marches on.

There will likely be a security fix in it or a dep at a later point, so you wouldn’t want to use the exact same version anyway.

17 days ago

nijave

A decent Python development toolchain handles most of that: Docker, pylint, black, and type hints integrated with your IDE/editor.

Admittedly some languages like Go do a better job integrating all this into the core of the language. However, Go doesn't tend to have as powerful of a stdlib so it tends to be a lot more verbose to achieve the same thing.

17 days ago

fire_lake

Docker should not be required for day-to-day development.

17 days ago

nijave

Why?

17 days ago

fire_lake

It’s hard to install on some setups. It’s much slower than native. It’s admitting defeat.

15 days ago

aprdm

Maybe because Python is already in use by pretty much every company that makes money in this (and other) domains? Some of what you mention looks like PEBKAC problems as well.

17 days ago

mhh__

Well, so is pretty much every other configuration language under the sun, and all the other options that aren't Python.

17 days ago

exceptione

I 100% agree with your points.

No type checking = no serious job. I have learned enough from Ansible to not ever touch that kind of stuff again.

There was a time when Python was a fringe language, only known by some hardcore nerds. I think Joel Spolsky once mentioned that having Python on your resume was a signal of a quality developer, someone who went off the beaten path.

Times have changed. Python is now the MS Excel of developers. It shines for quick-and-dirty data mangling. Unfortunately, that is how a sizeable portion of people approach software engineering. My theory is that, for some, having to do abstract thinking and perform a dry analysis beforehand is an impediment. They can only discover what they want while banging something out. They fix the runtime errors they happen to catch, and slap some more features on top.

Types imply a kind of foresight, and that is what some people really have difficulty with.

EDIT: This might sound negative, so I'll admit that the quick feedback cycle you get from an interpreted language like PHP/Python is a feature in itself.

17 days ago

IshKebab

I agree, Python is a pretty bad choice for all the reasons you mentioned.

That said I think there are precious few good alternatives. I've been using Deno a fair bit for "scripting" and it works pretty well, but I wish there were more options.

Also I have to say if you are using a tool like this to manage thousands of machines you're absolutely doing it wrong. I don't even work in ops/infra but even I know that manually running commands on multiple machines via SSH is asking for trouble.

17 days ago

benrutter

I think Go would be a logical choice if you're being completely language agnostic, but most teams aren't. If a team is already working exclusively in Python for web or data projects, there's a benefit to not introducing a new language just for infrastructure deployment if that's a small part of the team's function.

17 days ago

JoBrad

What would you have used? All of your issues aside, Python is very approachable to people who are used to managing infrastructure but may not have a strong programming background.

17 days ago

dboreham

Shiv is a decent solution for making a portable package. Single file that only depends on a recent system Python being installed.

17 days ago

mountainriver

It’s better than Yaml or HCL though

17 days ago

bborud

Python is a nightmare when used for tooling. I’ve wasted so much time wrangling Python tooling for embedded development. Go would be a much better choice.

17 days ago

manojlds

Does it allow me to run a script against an EC2 instance, say, and have it spin the instance up and take care of everything? Something like Packer would, but without creating an AMI.

18 days ago

letmeinhere

You ever try cloud-init?

You can specify your config in user-data when launching pretty generic AMIs. https://cloudinit.readthedocs.io/en/latest/index.html

18 days ago

manojlds

I don't want to launch instances (and run a script to set it up etc), I want to run my script THROUGH an instance.

17 days ago

nijave

I made this a while back; it uses shell scripts and the AWS CLI to spin instances up and cloud-init to run things: https://github.com/nijave/cloud-init-golden-image

For this type of use case AWS has managed services like Batch, ECS, or even auto scaling groups that can make this easier depending on what you're trying to achieve.

ECS with Fargate executors makes it fairly easy to run arbitrary things inside a VPC.

17 days ago

Fizzadar

You'd need to create the EC2 instance outside of pyinfra (i.e. in Terraform). This could be done as part of the inventory itself, but it wouldn't self-delete afterwards. If you're using Terraform, there's a connector that lets you plug Terraform output in as a pyinfra inventory: https://docs.pyinfra.com/en/2.x/connectors/terraform.html

18 days ago

verdverm

FYI, in Packer there is an option to not create the final image.

18 days ago

manojlds

Yes! Which is what I am doing now, but it felt a bit like using a tool that wasn't meant for the job.

17 days ago

Maledictus

IMHO Ruby is better for creating DSLs, so I wrote a small thing to scratch my own itch: https://github.com/marius/koch

This is not meant to scale to more than a handful of machines, but you get the idea how nice straight Ruby is for a machine specification DSL.

17 days ago

mdaniel

https://github.com/marius/koch/blob/main/example/Rezeptfile suffers from the same problem I have with every single ruby ever: what are the available verbs I can type?

Contrast that with https://docs.pyinfra.com/en/next/examples/client_side_assets... where any sane setup will show completions after typing both `from` and `local.`

17 days ago

Maledictus

Thank you for the feedback!

I agree that completion would be nice to have, and probably relatively hard to implement for koch.

However, I prefer the cleanliness, dare I say beauty, of the config file and Ruby.

16 days ago

lnxg33k1

Anecdata: maybe yes, but when I was using puppet it would take 45 minutes just to load

17 days ago

godisdad

Seems like an interesting generalized mix of something like https://github.com/cloudtools/troposphere and Ansible from a glance.

The value-add would be unifying provisioning and configuration management in a Python-y experience? The lifecycle of each is distinct, and that's traditionally where the headaches of using a single tool for both have come in.

18 days ago

voiprodrigo

Is this something that would be a good fit to automate node reboots/restarts of complex clustered systems? Think Kafka, Elasticsearch or Flink, where you can’t restart the next node without revalidating the state of the cluster and the rejoining of the previous node. Please feel free to suggest other tools for this purpose.

16 days ago

ornornor

I use pyinfra through Molecule for testing Ansible roles; it's made it possible to have a process resembling TDD and to have automated tests for my roles and playbooks. I actually don't know how else I'd do it other than with Molecule and pyinfra; being able to have automated tests on Ansible "code" made a big difference for me!

17 days ago

crispyambulance

Yeah, I like this approach.

There's something about YAML that just sucks the joy out of programming. It seems like a giant step backwards when we have plenty of amazing programming languages in existence.

Even when infrastructure YAML like CloudFormation is wrapped by some SDK, it can still be a pain, because you end up with stuff like...

    do_something("___((!-prickly_config_string_::might as well use yaml _blah-blah:blah))")
Back in the days of Java and XML, there used to be a distant promise of "binding" the XML to code (remember JAXB?) so that you could just manipulate it fluently as code and then "marshal" it back out to XML when you were done. Those days and that promise are gone, right?

18 days ago

mdaniel

https://github.com/aws/aws-cdk#at-a-glance is the "generate CloudFormation using code" option, and is the AWS version of troposphere as far as I can tell.

18 days ago

crispyambulance

CDK looks like it definitely does do that!

17 days ago

riffic

this feels like Michael DeHaan's OpsMop project that existed for like a week before he pulled all the code offline.

https://news.ycombinator.com/item?id=18717422

Interesting to see all the Ansible comments here. I'll have to check this out asap.

18 days ago

BodyCulture

I would like to point the tool at a virtual machine or a set of virtual machines that I have configured and have it reproduce/translate the state of these "model machines" to some hosting environment.

Can this or any other tool do that?

17 days ago

mdaniel

aws ec2 create-image :-D

In all seriousness, I would guess this requirement has a hidden 80/20 in it because it is very unlikely that one wishes every machine to be a perfect copy of each other, unless the config files have been very, very disciplined about the hard-coded strings and assumptions made

So even with my glib "create-image" response, there's almost certainly going to be some cloud-init that subsequently stamps the booted instance with its actual identity.

17 days ago

mixmastamyk

How often is this kind of tool needed since containers went mainstream? I had gathered they were not used as often any longer.

17 days ago

mdaniel

My experience has been that for day-zero stuff, e.g. how do you get a system _prepared_ for containers, this kind of tooling is handy. I side with the sibling comment that cloud-init is The Way, but it also requires (a) some trial and error and (b) thinking entirely in terms of cattle vs. pets, which some folks/organizations aren't ready for yet.

17 days ago

umen

There is a need for a Python module that compiles to Ansible code.

17 days ago

Feathercrown

This is really cool! Kinda seems like the Nix config approach.

17 days ago

lodriv

However the email address was not being processed

17 days ago

subhro

Why does this sound so similar to Chef?

18 days ago

[deleted]
18 days ago

beefnugs

Does anyone have any info on whether SaltStack is going to be enshittified? That is the situation that would get me looking for a replacement such as this.

17 days ago

emacsen

Maybe this is the best thing ever, but the documentation doesn't answer one simple question: What problem is it solving?

Is it a configuration management tool, like Ansible?

Is it meant for running one-off commands across the infrastructure, like Salt?

It says it integrates with Terraform, so it's not a provisioning tool...

What does it do different (and presumably better) than other tools?

The Getting Started guide doesn't cover this, the FAQ doesn't cover this, and the docs don't have an introductory section to cover this.

It's disheartening to find a potentially interesting project, but not really know what it does and how it might fit in your workflow.

18 days ago

fermigier

It's similar to Ansible, but uses Python as a declaration language rather than YAML.

It can also run one-off commands across the infrastructure (like Tentakel: https://pypi.org/project/tentakel/ ).

I've been using Pyinfra for some time. It's good enough for me.

18 days ago

js2

I've been using it as well with great success.

A couple years ago I inherited about 100 Mac Pros that are part of $dayjob's CI infrastructure. They had been managed over the years using a combination of shell scripts, Chef, and manually via VNC. No two machines were alike. The Chef recipes had all bit-rotted and weren't usable and due to $reasons were based on an old version of Chef that $company was stuck on.

So I looked around for alternatives, and being most comfortable in Python, I explored Ansible, Salt, and Pyinfra.

Ansible seemed like the obvious choice, but it has very few playbooks/actions for macOS systems. I was going to have to write my own. As I dug into its documentation, I found it was taking me a long time to wrap my head around all that I needed to do and started to sour on its complexity. This is a matter of taste, but I just didn't find Ansible very welcoming. I wanted something simpler.

I had previously used Fabric, so I considered using it again. But Fabric offers too little (it's really not much more than parallel SSH; if you want idempotent operations you have to write them yourself), and I don't agree with the direction it took with version 2.x.

Then I found Pyinfra. It took me less than 30 minutes to understand it in its entirety. It's conceptually simple: you have an inventory of machines that it connects to in parallel over ssh. You provide it with a deploy script that combines facts and operations. Pyinfra uses the deploy script to gather facts about each machine, then you use those facts to decide whether you need to perform any operations. It then performs those operations on each machine as needed. The inventory file, deploy script, facts, and operations are trivial to write for someone comfortable with Python. It's all Python, with the facts and operations being decorated functions. There is no DSL to learn. (It comes with a bunch of pre-written facts and operations, but they are mostly for Linux systems. I mostly had to write my own for macOS, but I found them really easy to write.)
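
To give a flavour, here's a rough sketch of a custom fact and operation (the names are made up, following the pyinfra 2.x docs pattern; exact decorator details may differ by version):

    from pyinfra import host
    from pyinfra.api import FactBase, operation

    # Hypothetical macOS fact: run a shell command and parse the output lines.
    class ScreenSaverIdleTime(FactBase):
        command = "defaults read com.apple.screensaver idleTime"

        def process(self, output):
            return int(output[0])

    # Hypothetical operation: only yield a command when a change is needed.
    @operation
    def set_screensaver_idle_time(seconds):
        if host.get_fact(ScreenSaverIdleTime) != seconds:
            yield f"defaults write com.apple.screensaver idleTime -int {seconds}"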

I had it operational the same day I found it. I used it to successfully get all of the Mac Pros into consistent state: things like system settings, installing Xcode, automating installs of brew packages all at the same version, installing JVMs, updating and upgrading macOS, installing Sentinel One, etc.

I've been very happy with it, even contributing a few PRs to fix small bugs and contribute minor functionality.

18 days ago

Fizzadar

I would love to see any macOS facts/operation code if you can/would be willing to share! We also managed a bunch of macs using pyinfra but mostly stuck to shell commands.

17 days ago

Fizzadar

> Is it a configuration management tool, like Ansible?

Yes

> Is it meant for running one-off commands across the infrastructure, like Salt?

Also yes.

> It says it integrates with Terraform, so it's not a provisioning tool...

The TF integration is specifically to use TF as an inventory source - ie TF to create resources and pyinfra to then configure them.

> What does it do different (and presumably better) than other tools?

The homepage covers the highlights. I originally created pyinfra because debugging Ansible was complicated (no plain stderr, since it's not "just" commands on the remote side) and slow, but things have evolved significantly since then.

> The Getting Started guide doesn't cover this. The FAQ doesn't cover this, and the Docs doesn't have an Introductory section to cover this.

Hugely appreciate this feedback, this is super helpful and something I will attempt to make clearer.

---

Quick attempt at a better explanation: you write Python code that defines operations (either stateful, "this apt package should be installed", or stateless, "run this command"), provide an inventory of targets (SSH, local machine), and pyinfra executes it.

Roughly sits where Ansible does for configuring servers, but also solves the case of "how do I run this command across my server fleet" (which I believe Ansible can also do).
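
As a minimal sketch (a hypothetical deploy file using the documented apt/server operations; exact argument names may vary by version):

    # deploy.py - run with e.g. `pyinfra inventory.py deploy.py` or `pyinfra @local deploy.py`
    from pyinfra.operations import apt, server

    # Stateful: make sure these packages are present (no-op if already installed).
    apt.packages(
        name="Install nginx",
        packages=["nginx"],
        update=True,
        _sudo=True,
    )

    # Stateless: just run a command on every target.
    server.shell(
        name="Print kernel version",
        commands=["uname -r"],
    )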

18 days ago

emacsen

I hope I wasn't coming across as too negative.

I genuinely think that an introduction with a few user stories would go a long way!

17 days ago

the_duke

> and slow, but things have evolved significantly since then

Well, Ansible is still dog-slow, so that part has not evolved...

18 days ago

Fizzadar

Heh yeah this is very true, I updated the perf test repo earlier this year to confirm https://docs.pyinfra.com/en/next/performance.html

17 days ago

remram

> Great for ad-hoc command execution, service deployment, configuration management and more

I found that pretty clear to be honest.

18 days ago

dangoodmanUT

People will know whether it solves their problem when they see it, no need to akchually the OP or maintainer

18 days ago

paulddraper

It's like Ansible.

That's what I discovered by reading the homepage.

18 days ago

mdaniel

I would disagree with that, since Ansible actually does two things simultaneously: cloud provisioning and local provisioning (and that "local" is actually hiding a 3rd axis: actual local, not just local to the managed instance. Say, for example, you needed the Azure libraries or such; you can use a pre_tasks: block to create a virtualenv and install the deps locally before firing up the main workload).

Reasonable people can 100% disagree about whether YAML is the correct packaging for those operations, and Ansible is a bit too imperative for my liking, but as far as "I have one hammer..." goes, it does all the things.

18 days ago

paulddraper

Ansible can technically do cloud and local provisioning, just as Terraform can technically do cloud and local.

But practically, these tools have their areas of speciality.

17 days ago

mdaniel

> just as Terraform can technically do cloud and local.

I feel as though we're splitting hairs here, given there is, to the best of my knowledge, no `resource remote_file make_sshd_config { inventory_host = "whatever" dest = "/etc/sshd_config" src = "./sshd_config.tmpl" vars = {...} }` in TF. There is template, and there is local_exec and the rest is a Simple Matter Of Programming :-/

I'm waiting patiently for someone to chime in "well, just spawn ansible in local_exec" as if they're missing the point

17 days ago

activatedgeek

Any kind of provisioning doesn't seem too far a step though. It is just another "operation" with its own state management logic.

18 days ago

mdaniel

I mean, I hear you in that python is Turing complete so all things are possible through another level of indirection, but I didn't see one shred of amazon.aws.autoscaling_group anywhere in their docs so .. what, I write my own? If I was going to go through the trouble of writing custom shit for Yet Another Awesome Cloud Thingy I'd fire me

18 days ago

activatedgeek

The fact that Pyinfra does not currently support a feature which can be implemented using Pyinfra philosophy does not make it different than Ansible. I believe that was what the parent comment was about.

18 days ago

emacsen

Digging into the docs, it uses words like "inventories" and "operations", which indeed look like a configuration management system; much like Ansible, it's agentless.

And that's cool (Ansible is a bit of an oddball system), but then I'm still left wondering: why is this better, or why is it better for the author at least?

I've used cfengine, Puppet, Chef, bcfg2 (briefly) and ansible. I want to know what makes this tool different and better. :)

18 days ago

tryauuum

Salt is not meant for running one-off commands. You can easily make sure state.apply is run for all of your infra several times an hour

17 days ago

[deleted]
18 days ago

koko-blat

It's because it solves every problem that has ever existed.

18 days ago

admin1231111

[flagged]

18 days ago

betimsl

[flagged]

17 days ago

Invictus0

[flagged]

18 days ago

remram

See FAQ #2

18 days ago

cheptsov

We build a similar tool, except we focus on AI workloads. We also support on-prem clusters now, in addition to GPU clouds: https://github.com/dstackai/dstack

17 days ago

linuxdude314

Should probably just stick with the Terraform CDK or Chef if you need this level of expressibility.

This is nowhere near the level of readiness needed to be reliably used in a production environment.

Verbose logging is not a reason to introduce a non-standard tool into your stack.

17 days ago

yjftsjthsd-h

> Should probably just stick with the Terraform CDK or Chef if you need this level of expressibility.

I'm not familiar with Terraform CDK, but I don't see what Chef does/has that this doesn't?

> This is no where near the level of readiness needed to be reliably used in a production environment.

Why?

17 days ago

imiric

> This is nowhere near the level of readiness needed to be reliably used in a production environment.

This is baseless FUD.

Pyinfra is 8 years old, just 2 years younger than Terraform. It's well maintained, stable, and used by many teams in production. Just because it's not as widely known or adopted as other tools, doesn't mean it should be avoided. In fact, as you can see from testimonials here, users often prefer it over Ansible.

> Should probably just stick with the Terraform CDK or Chef if you need this level of expressibility.

Terraform is used for provisioning infrastructure. Pyinfra is a configuration management tool. They're not equivalent.

Chef is closer, but it's an older tool that has largely been superseded by Ansible. It shouldn't be anyone's first choice, unless they really need some obscure feature it does better than Ansible, or Puppet for that matter.

> Verbose logging is not a reason to introduce a non-standard tool into your stack.

Why would that be the only reason to use this? That's not even one of its prominent features, and surely all tools in this space support verbose logging...

What a confused comment.

17 days ago

0xbadcafebee

so, it's Ansible...?

Configuration Management tools (that's what this, and Ansible, are) are a nice idea, but get very complicated very quickly. The tools themselves get complicated, the configuration gets complicated, you're constantly finding ways that the state gets broken that you need to re-incorporate into your script, it has to work in a variety of states, and you have to keep re-running and re-running and re-running it, monitoring for problems, investigating, fixing. Very complex, lots of maintenance, lots of potential problems. The "Pets" model from the phrase "Cattle, not Pets." I strongly recommend you do not raise Pets.

Instead, use Immutable Infrastructure: build an immutable image one time that works one way. Deploy that image. If you need to change it, change the build script, build a new image (with a new version), deploy a new instance with the new image, take the old one out back and shoot it. (The "Cattle" of "Cattle, not Pets") If the state gets out of whack or there are problems, just shoot it and deploy a new one that you know works.
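
As a rough illustration of that replace-don't-mutate cycle (hypothetical boto3 calls with made-up IDs, and skipping the wait for the image to become available):

    import boto3  # hypothetical AWS example; any image-based platform works the same way

    ec2 = boto3.client("ec2")

    # Bake a new, versioned image from a known-good instance.
    image = ec2.create_image(InstanceId="i-0123456789abcdef0", Name="app-v42")

    # Launch a replacement instance from the new image...
    ec2.run_instances(
        ImageId=image["ImageId"], InstanceType="t3.micro", MinCount=1, MaxCount=1
    )

    # ...then take the old instance out back and shoot it, rather than mutating it.
    ec2.terminate_instances(InstanceIds=["i-0123456789abcdef0"])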

This is the single most revolutionary concept I've seen in over 20 years of doing this job. It is an absolute game-changer. I would not go back to Configuration Management for all the tea in China.

18 days ago

aprdm

You're conflating different things - this has nothing to do with Pet vs cattle.

Even in your confusion, state still exists in the real world and needs to live somewhere; it is also unfeasible to always recreate big states.

17 days ago

0xbadcafebee

This happens so often on HN, and it is so god damn frustrating. I'm literally a fucking expert, telling you the best thing to do and explaining why, and I get downvoted for it. The next person who tells me in a comment "explain your opinion! you're not helping!" when I don't write an entire novel to justify my position, I'm going to link back to this thread. Pointless.

I've gone to the trouble of googling these articles for you (it took me a whole 30 seconds!). Please read any of them.

https://webcache.googleusercontent.com/search?q=cache:https:...

https://devopscube.com/immutable-infrastructure/

https://thenewstack.io/a-brief-look-at-immutable-infrastruct...

https://www.digitalocean.com/community/tutorials/what-is-imm...

https://www.hashicorp.com/resources/what-is-mutable-vs-immut...

https://www.techtarget.com/searchitoperations/definition/imm...

https://www.oreilly.com/radar/an-introduction-to-immutable-i...

https://www.terraformpilot.com/articles/mutable-vs-immutable...

https://www.bmc.com/blogs/immutable-infrastructure/

https://www.linode.com/docs/guides/what-is-immutable-infrast...

https://devops.com/immutable-infrastructure-the-next-step-fo...

https://openupthecloud.com/what-is-immutable-infrastructure/

https://www.opsramp.com/guides/why-kubernetes/infrastructure...

https://www.cloudbees.com/blog/immutable-infrastructure

https://www.daily-devops.com/devops/immutable/architecture-p...

http://radar.oreilly.com/2015/06/an-introduction-to-immutabl...

https://highops.com/insights/immutable-infrastructure-what-i...

https://docs.aws.amazon.com/wellarchitected/latest/financial...

17 days ago

aprdm

Maybe you're not the only expert on HN? For someone to write what you wrote after 20 years of experience is a bit interesting, and you did write a lot!

I might have more YoE than you do, for example, and might have worked at bigger companies/infras than you did. What does that matter to the opinion at hand?

17 days ago

crabbone

As someone who had to write infrastructure in Python, every time from scratch, for large projects: pyinfra isn't it (and neither is Ansible, if you care about that).

It will probably work for some simple and common cases, but they barely need any automation anyways...

The problem isn't even the tool itself, it's the lack of standards. Every large enough system is too unique to be easily managed by cookie-cutter tools like this one. Some people will bite the bullet anyway and try to adapt general-purpose infra tools to their case. I've seen that too. It is a very miserable experience, frustrating in that obviously simple and necessary things are sometimes described as "impossible" because of how the chosen framework works. By contrast, home-brewed systems usually suffer from a lack of generality and a worse user experience in general, and quickly start lagging behind the underlying technology updates...

Also, out of popular languages, Python would be somewhere towards the bottom of the hierarchy if I had to choose a language to manage infrastructure. The only redeeming quality of Python is its popularity. On engineering merits alone it's unremarkable at best.

----

PS.

    import click
If I see this in the project source code, I blacklist it and never look at it again. This is a red flag, a sure sign that the person writing it is clueless.

17 days ago