mirror of
https://github.com/syssec-utd/pylingual.git
synced 2026-05-10 18:39:03 -07:00
initial commit
+18
@@ -0,0 +1,18 @@
dataset/
venv/
*.pyc
.idea/
.python-version
poetry.toml
test/
.DS_Store
results/
.vscode/
errors/
logs/
segmentation_test_cases/
__pycache__/
.env
mise.toml
dist/
decompiled_*/
@@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.

Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and
modification follow.

TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based
on the Program.

To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.

A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

The Corresponding Source for a work in source code form is that
same work.

2. Basic Permissions.

All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.

b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".

c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.

d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.

A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.

b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.

c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.

d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.

e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.

A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or

b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or

c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or

d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or

e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or

f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.

All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
|
||||||
|
|
||||||
|
12. No Surrender of Others' Freedom.
|
||||||
|
|
||||||
|
If conditions are imposed on you (whether by court order, agreement or
|
||||||
|
otherwise) that contradict the conditions of this License, they do not
|
||||||
|
excuse you from the conditions of this License. If you cannot convey a
|
||||||
|
covered work so as to satisfy simultaneously your obligations under this
|
||||||
|
License and any other pertinent obligations, then as a consequence you may
|
||||||
|
not convey it at all. For example, if you agree to terms that obligate you
|
||||||
|
to collect a royalty for further conveying from those to whom you convey
|
||||||
|
the Program, the only way you could satisfy both those terms and this
|
||||||
|
License would be to refrain entirely from conveying the Program.
|
||||||
|
|
||||||
|
13. Use with the GNU Affero General Public License.
|
||||||
|
|
||||||
|
Notwithstanding any other provision of this License, you have
|
||||||
|
permission to link or combine any covered work with a work licensed
|
||||||
|
under version 3 of the GNU Affero General Public License into a single
|
||||||
|
combined work, and to convey the resulting work. The terms of this
|
||||||
|
License will continue to apply to the part which is the covered work,
|
||||||
|
but the special requirements of the GNU Affero General Public License,
|
||||||
|
section 13, concerning interaction through a network will apply to the
|
||||||
|
combination as such.
|
||||||
|
|
||||||
|
14. Revised Versions of this License.
|
||||||
|
|
||||||
|
The Free Software Foundation may publish revised and/or new versions of
|
||||||
|
the GNU General Public License from time to time. Such new versions will
|
||||||
|
be similar in spirit to the present version, but may differ in detail to
|
||||||
|
address new problems or concerns.
|
||||||
|
|
||||||
|
Each version is given a distinguishing version number. If the
|
||||||
|
Program specifies that a certain numbered version of the GNU General
|
||||||
|
Public License "or any later version" applies to it, you have the
|
||||||
|
option of following the terms and conditions either of that numbered
|
||||||
|
version or of any later version published by the Free Software
|
||||||
|
Foundation. If the Program does not specify a version number of the
|
||||||
|
GNU General Public License, you may choose any version ever published
|
||||||
|
by the Free Software Foundation.
|
||||||
|
|
||||||
|
If the Program specifies that a proxy can decide which future
|
||||||
|
versions of the GNU General Public License can be used, that proxy's
|
||||||
|
public statement of acceptance of a version permanently authorizes you
|
||||||
|
to choose that version for the Program.
|
||||||
|
|
||||||
|
Later license versions may give you additional or different
|
||||||
|
permissions. However, no additional obligations are imposed on any
|
||||||
|
author or copyright holder as a result of your choosing to follow a
|
||||||
|
later version.
|
||||||
|
|
||||||
|
15. Disclaimer of Warranty.
|
||||||
|
|
||||||
|
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
|
||||||
|
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
|
||||||
|
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
|
||||||
|
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
|
||||||
|
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||||
|
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
|
||||||
|
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
|
||||||
|
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
|
||||||
|
|
||||||
|
16. Limitation of Liability.
|
||||||
|
|
||||||
|
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||||
|
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
|
||||||
|
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
|
||||||
|
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
|
||||||
|
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
|
||||||
|
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
|
||||||
|
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
|
||||||
|
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
|
||||||
|
SUCH DAMAGES.
|
||||||
|
|
||||||
|
17. Interpretation of Sections 15 and 16.
|
||||||
|
|
||||||
|
If the disclaimer of warranty and limitation of liability provided
|
||||||
|
above cannot be given local legal effect according to their terms,
|
||||||
|
reviewing courts shall apply local law that most closely approximates
|
||||||
|
an absolute waiver of all civil liability in connection with the
|
||||||
|
Program, unless a warranty or assumption of liability accompanies a
|
||||||
|
copy of the Program in return for a fee.
|
||||||
|
|
||||||
|
END OF TERMS AND CONDITIONS
|
||||||
|
|
||||||
|
How to Apply These Terms to Your New Programs
|
||||||
|
|
||||||
|
If you develop a new program, and you want it to be of the greatest
|
||||||
|
possible use to the public, the best way to achieve this is to make it
|
||||||
|
free software which everyone can redistribute and change under these terms.
|
||||||
|
|
||||||
|
To do so, attach the following notices to the program. It is safest
|
||||||
|
to attach them to the start of each source file to most effectively
|
||||||
|
state the exclusion of warranty; and each file should have at least
|
||||||
|
the "copyright" line and a pointer to where the full notice is found.
|
||||||
|
|
||||||
|
<one line to give the program's name and a brief idea of what it does.>
|
||||||
|
Copyright (C) <year> <name of author>
|
||||||
|
|
||||||
|
This program is free software: you can redistribute it and/or modify
|
||||||
|
it under the terms of the GNU General Public License as published by
|
||||||
|
the Free Software Foundation, either version 3 of the License, or
|
||||||
|
(at your option) any later version.
|
||||||
|
|
||||||
|
This program is distributed in the hope that it will be useful,
|
||||||
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
GNU General Public License for more details.
|
||||||
|
|
||||||
|
You should have received a copy of the GNU General Public License
|
||||||
|
along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
|
Also add information on how to contact you by electronic and paper mail.
|
||||||
|
|
||||||
|
If the program does terminal interaction, make it output a short
|
||||||
|
notice like this when it starts in an interactive mode:
|
||||||
|
|
||||||
|
<program> Copyright (C) <year> <name of author>
|
||||||
|
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||||
|
This is free software, and you are welcome to redistribute it
|
||||||
|
under certain conditions; type `show c' for details.
|
||||||
|
|
||||||
|
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||||
|
parts of the General Public License. Of course, your program's commands
|
||||||
|
might be different; for a GUI interface, you would use an "about box".
|
||||||
|
|
||||||
|
You should also get your employer (if you work as a programmer) or school,
|
||||||
|
if any, to sign a "copyright disclaimer" for the program, if necessary.
|
||||||
|
For more information on this, and how to apply and follow the GNU GPL, see
|
||||||
|
<https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
|
The GNU General Public License does not permit incorporating your program
|
||||||
|
into proprietary programs. If your program is a subroutine library, you
|
||||||
|
may consider it more useful to permit linking proprietary applications with
|
||||||
|
the library. If this is what you want to do, use the GNU Lesser General
|
||||||
|
Public License instead of this License. But first, please read
|
||||||
|
<https://www.gnu.org/licenses/why-not-lgpl.html>.
|
||||||
@@ -0,0 +1,60 @@
# PyLingual - Python Decompiler for 3.6+

PyLingual is a CPython bytecode decompiler supporting all released Python versions since 3.6. For information about the design and implementation of PyLingual, please refer to our [research paper](https://www.computer.org/csdl/proceedings-article/sp/2025/223600a052/21B7QZB86cg).

PyLingual can be run through our [web service](https://pylingual.io) or run locally.

This codebase is optimized for readability and future extension, so there may initially be some control flow accuracy regression compared to the version hosted on the web service.

## Requirements

- Python 3.11 or higher

### Compiling bytecode

Some parts of PyLingual require the ability to compile bytecode in a different Python version (equivalence check and model training). For this, you will need the following:

- [pyenv](https://github.com/pyenv/pyenv) with all Python versions you want to compile to
- Unix-like operating system (pyenv does not support Windows)
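
As a rough illustration of what compiling with another interpreter involves, the sketch below builds a `pyenv exec` command that asks a specific interpreter to byte-compile a file with the standard `py_compile` module. The helper names `build_compile_command` and `compile_with_pyenv` and the example version string are hypothetical. This is not PyLingual's own compilation code, only a minimal sketch that assumes pyenv is on the `PATH` and the requested version was already installed with `pyenv install`.

```python
import os
import pathlib
import subprocess


def build_compile_command(source: pathlib.Path, output: pathlib.Path) -> list[str]:
    """Build a command that byte-compiles `source` into `output` (hypothetical helper)."""
    # The snippet runs inside the *target* interpreter, so it only uses py_compile,
    # which is part of the standard library in every supported version.
    snippet = "import py_compile, sys; py_compile.compile(sys.argv[1], cfile=sys.argv[2], doraise=True)"
    return ["pyenv", "exec", "python", "-c", snippet, str(source), str(output)]


def compile_with_pyenv(source: pathlib.Path, output: pathlib.Path, pyenv_version: str) -> None:
    """Run the command under the interpreter selected by PYENV_VERSION, e.g. "3.8.18"."""
    env = dict(os.environ, PYENV_VERSION=pyenv_version)
    subprocess.run(build_compile_command(source, output), env=env, check=True)
```

Separating command construction from execution keeps the first helper easy to inspect without pyenv being present.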

## Setup

Install from source, using [Poetry](https://python-poetry.org/):

```sh
git clone https://github.com/syssec-utd/pylingual
cd pylingual
python -m venv venv
source venv/bin/activate
pip install poetry
poetry install
```

## Usage

```
Usage: pylingual [OPTIONS] [FILES]...

  End to end pipeline to decompile Python bytecode into source code.

Options:
  -o, --out-dir PATH      The directory to export results to.
  -c, --config-file PATH  Config file for model information.
  -v, --version VERSION   Python version of the .pyc, default is auto
                          detection.
  -k, --top-k INT         Maximum number of additional segmentations to
                          consider.
  -q, --quiet             Suppress console output.
  --trust-lnotab          Use the lnotab for segmentation instead of the
                          segmentation model.
  --init-pyenv            Install pyenv before decompiling.
  -h, --help              Show this message and exit.
```

## Demo



## Support

If you have any issues installing or using PyLingual, please create an issue or email our support address at pylingual@gmail.com.
@@ -0,0 +1,48 @@
# Model Training

PyLingual's accuracy depends on accurate segmentation and statement models [^1]. The segmentation model divides a list of bytecode instructions into groups, one per source statement. The statement model transforms each group of instructions into source code. Instructions for training these models are as follows:

## Dataset generation

First install [pyenv](https://github.com/pyenv/pyenv) and the required Python versions for the dataset. Create a dataset JSON file based on the sample (`sample_jsons/py36-sample-data.json`).

The dataset directory should be structured as follows, with only one `.py` file per directory:

```
dataset
├── 0
│   └── file.py
├── 1
│   └── file.py
...
├── 999
│   └── file.py
└── 1000
    └── file.py
```
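
If your source files start out in a single flat folder, a small script can arrange them into this one-file-per-directory layout. The following is a minimal sketch and not part of PyLingual; the function name `build_dataset_layout` and the choice of numeric directory names are illustrative assumptions.

```python
import pathlib
import shutil


def build_dataset_layout(flat_dir: pathlib.Path, dataset_dir: pathlib.Path) -> int:
    """Copy each .py file in flat_dir into its own numbered subdirectory of dataset_dir."""
    sources = sorted(flat_dir.glob("*.py"))
    for index, source_file in enumerate(sources):
        # One directory per file: dataset/0/a.py, dataset/1/b.py, ...
        target_dir = dataset_dir / str(index)
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(source_file, target_dir / source_file.name)
    return len(sources)
```

The function returns the number of files copied, which is useful for checking that enough samples exist for the requested train, test, and validation counts.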

The names of the inner directories and files do not matter. Then create the dataset:

```
python prepare_dataset.py <path to JSON>
```

## Segmentation model

Create a segmentation model JSON file based on the sample (`sample_jsons/py36-sample-segmentation.json`). Then train the model:

```
python train_models.py --segmentation <path to JSON>
```
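
For intuition about what the segmentation model learns, the dataset code in this repository labels every bytecode instruction with a boundary tag: `B` for the first instruction of a source line, `E` for the last, and `I` for any in between (a line with a single instruction gets just `B`). The sketch below reproduces that labeling in isolation. The function name `boundary_tags` and the sample opcode strings are illustrative and do not match PyLingual's actual model-view format.

```python
def boundary_tags(instruction_groups: list[list[str]]) -> list[str]:
    """Return one B/I/E tag per instruction, given instructions grouped by source line."""
    tags = []
    for group in instruction_groups:
        if not group:
            raise ValueError("each source line needs at least one instruction")
        if len(group) == 1:
            tags.append("B")
        else:
            tags.extend(["B"] + ["I"] * (len(group) - 2) + ["E"])
    return tags


groups = [
    ["LOAD_CONST", "STORE_NAME"],
    ["LOAD_NAME", "LOAD_CONST", "BINARY_ADD", "STORE_NAME"],
    ["RETURN_VALUE"],
]
print(boundary_tags(groups))  # ['B', 'E', 'B', 'I', 'I', 'E', 'B']
```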

## Statement model

Create a statement model JSON file based on the sample (`sample_jsons/py36-sample-statement.json`). Then train the model:

```
python train_models.py --statement <path to JSON>
```
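
Before statement samples are written, the dataset code renumbers the `<mask_N>` placeholders so that they count up from 0 in order of first appearance in the bytecode, and applies the same renumbering to the source. A self-contained sketch of that step follows; the function name `normalize_masks` and the sample strings are illustrative.

```python
import re

# Matches only the digits inside a <mask_N> placeholder.
MASK_NUMBER = re.compile(r"(?<=<mask_)\d+(?=>)")


def normalize_masks(source: str, bytecode: str) -> tuple[str, str]:
    """Renumber masks by first appearance in the bytecode, in both strings."""
    # dict.fromkeys keeps first-seen order while dropping repeats.
    order = list(dict.fromkeys(MASK_NUMBER.findall(bytecode)))
    new_number = {old: str(position) for position, old in enumerate(order)}

    def renumber(match: re.Match) -> str:
        # Raises KeyError if the source uses a mask the bytecode never mentions;
        # the dataset code skips such samples before reaching this step.
        return new_number[match.group(0)]

    return MASK_NUMBER.sub(renumber, source), MASK_NUMBER.sub(renumber, bytecode)


source, bytecode = normalize_masks(
    "<mask_7> = <mask_3>",
    "LOAD_NAME , <mask_3> <SEP> STORE_NAME , <mask_7>",
)
print(source)    # <mask_1> = <mask_0>
print(bytecode)  # LOAD_NAME , <mask_0> <SEP> STORE_NAME , <mask_1>
```

Normalizing the numbering this way means the same statement always produces the same training sample, regardless of which mask indices it happened to receive within its file.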

Once models are trained, update `../pylingual/decompiler_config.yaml` or create a separate config file by replacing the old models with the newly trained ones.

[^1]: [pylingual models](https://huggingface.co/syssec-utd).
@@ -0,0 +1,50 @@
|
|||||||
|
from dataclasses import dataclass
|
||||||
|
import pathlib
|
||||||
|
|
||||||
|
from typing import Tuple, List
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class DataRequest:
|
||||||
|
name: str
|
||||||
|
source_path: pathlib.Path
|
||||||
|
num_train: int
|
||||||
|
num_test: int
|
||||||
|
num_valid: int
|
||||||
|
|
||||||
|
@property
|
||||||
|
def total_files(self):
|
||||||
|
return self.num_train + self.num_test + self.num_valid
|
||||||
|
|
||||||
|
def __post_init__(self):
|
||||||
|
self.source_path = pathlib.Path(self.source_path)
|
||||||
|
if not self.source_path.exists():
|
||||||
|
raise FileNotFoundError(f"{self.source_path} for DataRequest {self.name} does not exist")
|
||||||
|
|
||||||
|
if self.num_train < 0:
|
||||||
|
raise ValueError(f"Training sample count for DataRequest {self.name} must be non-negative")
|
||||||
|
if self.num_test < 0:
|
||||||
|
raise ValueError(f"Testing sample count for DataRequest {self.name} must be non-negative")
|
||||||
|
if self.num_valid < 0:
|
||||||
|
raise ValueError(f"Validation sample count for DataRequest {self.name} must be non-negative")
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class DatasetDescription:
|
||||||
|
name: str
|
||||||
|
version: Tuple[int, int]
|
||||||
|
save_to_dir: pathlib.Path
|
||||||
|
huggingface_user: str
|
||||||
|
data_requests: List[DataRequest]
|
||||||
|
|
||||||
|
@property
|
||||||
|
def code_dir(self):
|
||||||
|
return self.save_to_dir / self.name / "code"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def csv_dir(self):
|
||||||
|
return self.save_to_dir / self.name / "csv"
|
||||||
|
|
||||||
|
def __post_init__(self):
|
||||||
|
self.save_to_dir = pathlib.Path(self.save_to_dir)
|
||||||
|
self.version = tuple(self.version)
|
||||||
@@ -0,0 +1,3 @@
|
|||||||
|
from .create_code_dataset import transfer_and_compile_file
|
||||||
|
|
||||||
|
__all__ = ["transfer_and_compile_file"]
|
||||||
@@ -0,0 +1,216 @@
|
|||||||
|
import csv
|
||||||
|
import itertools
|
||||||
|
import logging
|
||||||
|
import multiprocessing
|
||||||
|
import pathlib
|
||||||
|
import re
|
||||||
|
import signal
|
||||||
|
from typing import Callable, Tuple
|
||||||
|
|
||||||
|
import tqdm
|
||||||
|
from pylingual.editable_bytecode import PYCFile
|
||||||
|
|
||||||
|
from pylingual.masking.ast_masker import DUMMY_DECORATOR
|
||||||
|
from pylingual.masking.model_disasm import fix_jump_targets
|
||||||
|
from .DatasetDescription import DataRequest
|
||||||
|
from pylingual.masking.model_disasm import create_global_masker, mask_source
|
||||||
|
|
||||||
|
bytecode_separator = " <SEP> "
|
||||||
|
source_seperator = " <SEP> "
|
||||||
|
CSV_SGMT_HEADER = ["source", "bytecode", "boundary", "file"]
|
||||||
|
CSV_STMT_HEADER = ["source", "bytecode", "file"]
|
||||||
|
|
||||||
|
|
||||||
|
def create_csv_dataset(code_dataset_path: pathlib.Path, csv_dataset_path: pathlib.Path, data_requests: list[DataRequest], logger: logging.Logger = None):
|
||||||
|
progress_bar = tqdm.tqdm(total=sum([request.total_files for request in data_requests]))
|
||||||
|
for split in ("train", "test", "valid"):
|
||||||
|
if logger:
|
||||||
|
logger.info(f"Converting the {split} split to CSV...")
|
||||||
|
write_csvs(code_dataset_path / split, csv_dataset_path / split, logger, progress_bar=progress_bar)
|
||||||
|
|
||||||
|
|
||||||
|
def write_csvs(source_path: pathlib.Path, csv_output_path: pathlib.Path, logger: logging.Logger = None, max_csv_rows: int = 30000, progress_bar: tqdm.tqdm = None):
|
||||||
|
# validate output directory
|
||||||
|
if csv_output_path.exists():
|
||||||
|
if not csv_output_path.is_dir():
|
||||||
|
raise OSError("CSV output path is not a directory")
|
||||||
|
else:
|
||||||
|
csv_output_path.mkdir(parents=True)
|
||||||
|
|
||||||
|
##### csv write wrappers to preserve csv row limit
|
||||||
|
|
||||||
|
def csv_writer(file_prefix: str, csv_header: list) -> Callable:
|
||||||
|
out_dir = csv_output_path.joinpath(file_prefix)
|
||||||
|
out_dir.mkdir(exist_ok=True)
|
||||||
|
|
||||||
|
for csv_idx in itertools.count():
|
||||||
|
new_path = out_dir.joinpath(f"{file_prefix}_{csv_idx}.csv")
|
||||||
|
new_path.touch()
|
||||||
|
if logger:
|
||||||
|
logger.info(f"Creating new csv {new_path.resolve()}...")
|
||||||
|
with new_path.open(mode="w") as csv_file:
|
||||||
|
writer = csv.writer(csv_file)
|
||||||
|
writer.writerow(csv_header)
|
||||||
|
for writer in itertools.repeat(writer, max_csv_rows):
|
||||||
|
yield writer.writerow
|
||||||
|
|
||||||
|
segmentation_writer = csv_writer("segmentation", CSV_SGMT_HEADER)
|
||||||
|
statement_writer = csv_writer("statement", CSV_STMT_HEADER)
|
||||||
|
|
||||||
|
    # collect the per-sample code directories
|
||||||
|
code_dirs = (child for child in source_path.iterdir() if child.is_dir())
|
||||||
|
|
||||||
|
def bytecode2csv_args():
|
||||||
|
for dir in code_dirs:
|
||||||
|
py_path = next(dir.glob("*.py"), None)
|
||||||
|
pyc_path = next(dir.glob("*.pyc"), None)
|
||||||
|
if None in (py_path, pyc_path):
|
||||||
|
logging.debug(f"PY or PYC file not found in {dir}")
|
||||||
|
continue
|
||||||
|
else:
|
||||||
|
yield (py_path, pyc_path)
|
||||||
|
|
||||||
|
num_fails = 0
|
||||||
|
with multiprocessing.Pool() as pool:
|
||||||
|
for result in pool.imap_unordered(bytecode2csv_exception_wrapper, bytecode2csv_args()):
|
||||||
|
if isinstance(result, Exception):
|
||||||
|
num_fails += 1
|
||||||
|
                if logger:
                    # the wrapped exception message already names the offending files
                    logger.debug(f"ERR: {result}\nTYPE ERR: {type(result)}\n")
|
||||||
|
continue
|
||||||
|
|
||||||
|
(segmentation_rows, statement_rows) = result
|
||||||
|
for row, writerow in zip(segmentation_rows, segmentation_writer):
|
||||||
|
writerow(row)
|
||||||
|
for row, writerow in zip(statement_rows, statement_writer):
|
||||||
|
writerow(row)
|
||||||
|
|
||||||
|
if progress_bar:
|
||||||
|
progress_bar.update()
|
||||||
|
progress_bar.set_postfix({"num_fails": num_fails})
|
||||||
|
|
||||||
|
    if logger:
        logger.info(f"NUMBER OF FAILS !!! {num_fails}")
|
||||||
|
|
||||||
|
|
||||||
|
def timeout_handler(signum, frame):
|
||||||
|
raise TimeoutError()
|
||||||
|
|
||||||
|
|
||||||
|
def bytecode2csv_exception_wrapper(paths: Tuple[pathlib.Path, pathlib.Path]) -> Tuple[list, list] | Exception:
|
||||||
|
signal.signal(signal.SIGALRM, timeout_handler)
|
||||||
|
try:
|
||||||
|
signal.alarm(30) # set 30 second timeout
|
||||||
|
results = bytecode2csv(*paths)
|
||||||
|
signal.alarm(0) # success; disable timer
|
||||||
|
return results
|
||||||
|
except Exception as error:
|
||||||
|
signal.alarm(0) # disable timer in case another exception triggered the fail
|
||||||
|
return Exception(f"{type(error)}: {error} in file {paths}")
|
||||||
|
|
||||||
|
|
||||||
|
def bytecode2csv(py_path: pathlib.Path, pyc_path: pathlib.Path) -> tuple[list, list]:
|
||||||
|
"""Creates segmentation and statement csv rows for given bytecode and source file"""
|
||||||
|
segmentation_rows = []
|
||||||
|
statement_rows = []
|
||||||
|
|
||||||
|
pyc = PYCFile(str(pyc_path.resolve()))
|
||||||
|
if pyc.version == (3, 10):
|
||||||
|
pyc.replace_duplicated_returns10(py_path.read_text().split("\n"))
|
||||||
|
elif pyc.version == (3, 12):
|
||||||
|
pyc.replace_duplicated_returns12(py_path.read_text().split("\n"))
|
||||||
|
global_masker = create_global_masker(pyc)
|
||||||
|
|
||||||
|
masked_source_text = mask_source(py_path, global_masker, pyc.version)
|
||||||
|
masked_source_lines = masked_source_text.split("\n")
|
||||||
|
|
||||||
|
# filter out dummy decorators added in <= 3.7
|
||||||
|
dummy_lnos = []
|
||||||
|
if pyc.version <= (3, 7):
|
||||||
|
        # remove dummy decorators from bytecode
|
||||||
|
pyc._patch_dummy_decorator(dummy_decorator_name=DUMMY_DECORATOR)
|
||||||
|
try: # if no functions are in source, then dummy will not exist
|
||||||
|
dummy_decorator_line = f"@{global_masker.mask(DUMMY_DECORATOR)}"
|
||||||
|
except KeyError:
|
||||||
|
dummy_decorator_line = None
|
||||||
|
dummy_lnos = [lno + 1 for lno, source in enumerate(masked_source_lines) if source.strip() == dummy_decorator_line]
|
||||||
|
|
||||||
|
seen_lines = set()
|
||||||
|
|
||||||
|
# create rows for each bytecode
|
||||||
|
for bc in pyc.iter_bytecodes():
|
||||||
|
# we ignore comprehensions, hoisted later
|
||||||
|
if bc.is_comprehension:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# attempt to filter lines
|
||||||
|
lno_insts = bc.get_lno_insts(previously_seen_lines=seen_lines)
|
||||||
|
|
||||||
|
# create line num : model disasm view of insts
|
||||||
|
lno_model_view_insts = {lno: [global_masker.get_model_view(inst) for inst in line_insts] for lno, line_insts in lno_insts.items()}
|
||||||
|
seen_lines.update(lno_model_view_insts.keys())
|
||||||
|
|
||||||
|
# segment source
|
||||||
|
if pyc.version <= (3, 7):
|
||||||
|
segmented_source_lines = []
|
||||||
|
for line_num in lno_model_view_insts:
|
||||||
|
if not line_num:
|
||||||
|
segmented_source_lines.append("")
|
||||||
|
elif line_num in dummy_lnos:
|
||||||
|
segmented_source_lines.append(masked_source_lines[line_num].strip())
|
||||||
|
else:
|
||||||
|
segmented_source_lines.append(masked_source_lines[line_num - 1].strip())
|
||||||
|
else:
|
||||||
|
segmented_source_lines = [masked_source_lines[line_num - 1].strip() if line_num else "" for line_num in lno_model_view_insts.keys()] # -1 to convert from line num to index in array
|
||||||
|
|
||||||
|
model_disasm_text = bytecode_separator.join(val for val in itertools.chain(*lno_model_view_insts.values()))
|
||||||
|
|
||||||
|
if len(segmented_source_lines) != len(lno_model_view_insts):
|
||||||
|
raise ValueError("Length mismatch between segmented source and segmented bytecodes")
|
||||||
|
|
||||||
|
# create bytecode segmentation
|
||||||
|
boundaries = []
|
||||||
|
for bc_line in lno_model_view_insts.values():
|
||||||
|
if len(bc_line) == 1:
|
||||||
|
bounds = "B"
|
||||||
|
elif len(bc_line) >= 2:
|
||||||
|
bounds = "B" + "I" * (len(bc_line) - 2) + "E"
|
||||||
|
else:
|
||||||
|
raise ValueError("Unexpected amount of bytecodes segmented into a line")
|
||||||
|
boundaries.extend(list(bounds))
|
||||||
|
|
||||||
|
# append rows
|
||||||
|
segmentation_rows.append([source_seperator.join(segmented_source_lines), model_disasm_text, boundaries, str(py_path)])
|
||||||
|
for segmented_source, bytecodes in zip(segmented_source_lines, lno_model_view_insts.values()):
|
||||||
|
# skip empty lines
|
||||||
|
if not segmented_source or segmented_source == "None":
|
||||||
|
continue
|
||||||
|
# skip fillers
|
||||||
|
if segmented_source in ("pass", "...") and ("RETURN_VALUE" in bytecodes or "RETURN_CONST , None" in bytecodes):
|
||||||
|
continue
|
||||||
|
# skip string-only lines that aren't docstrings
|
||||||
|
if (segmented_source.startswith("'") or segmented_source.startswith('"')) and not any("__doc__" in b for b in bytecodes):
|
||||||
|
continue
|
||||||
|
if segmented_source.startswith("elif "):
|
||||||
|
                segmented_source = segmented_source[2:]  # drop the leading "el" so the sample reads as a plain "if"
|
||||||
|
|
||||||
|
joined_bytecode = bytecode_separator.join(bytecodes)
|
||||||
|
|
||||||
|
# DUCT-TAPE; skip samples where model has to guess masks
|
||||||
|
source_masks = set(re.findall(r"<mask_\d+>", segmented_source))
|
||||||
|
bytecode_masks = set(re.findall(r"<mask_\d+>", joined_bytecode))
|
||||||
|
if not source_masks <= bytecode_masks:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# normalize source mask order for statements
|
||||||
|
# replace mask values to start at 0 and count up
|
||||||
|
mask_regex = re.compile(r"(?<=<mask_)\d+(?=>)")
|
||||||
|
masks = mask_regex.findall(joined_bytecode)
|
||||||
|
mask_order = [x for i, x in enumerate(masks) if masks.index(x) == i]
|
||||||
|
normalized_mask_bytecode = mask_regex.sub(lambda x: str(mask_order.index(x.group(0))), joined_bytecode)
|
||||||
|
normalized_mask_source = mask_regex.sub(lambda x: str(mask_order.index(x.group(0))), segmented_source)
|
||||||
|
|
||||||
|
# normalize jump targets
|
||||||
|
normalized_mask_bytecode = fix_jump_targets(normalized_mask_bytecode)
|
||||||
|
|
||||||
|
statement_rows.append([normalized_mask_source, normalized_mask_bytecode, str(py_path)])
|
||||||
|
|
||||||
|
return (segmentation_rows, statement_rows)
|
||||||
@@ -0,0 +1,114 @@
|
|||||||
|
import itertools
|
||||||
|
import logging
|
||||||
|
import multiprocessing
|
||||||
|
import pathlib
|
||||||
|
import random
|
||||||
|
from typing import List, Optional, Set, Tuple
|
||||||
|
|
||||||
|
import tqdm
|
||||||
|
|
||||||
|
from .DatasetDescription import DataRequest
|
||||||
|
from pylingual.utils.generate_bytecode import compile_version
|
||||||
|
from .normalize_source import normalize_source
|
||||||
|
from pylingual.masking.ast_masker import add_dummy_decorators
|
||||||
|
|
||||||
|
|
||||||
|
def transfer_and_compile_file(
|
||||||
|
original_file: pathlib.Path,
|
||||||
|
destination_file: pathlib.Path,
|
||||||
|
version: Tuple[int, int],
|
||||||
|
) -> Optional[Exception]:
|
||||||
|
# copy over normalized source file
|
||||||
|
try:
|
||||||
|
normalized_source = normalize_source(original_file.read_text(), version=version, replace_docstrings=True)
|
||||||
|
|
||||||
|
if version[:2] <= (3, 7):
|
||||||
|
normalized_source = add_dummy_decorators(normalized_source)
|
||||||
|
except Exception as err:
|
||||||
|
return err
|
||||||
|
|
||||||
|
destination_file.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
destination_file.write_text(normalized_source)
|
||||||
|
|
||||||
|
# compile the copied file with the given version
|
||||||
|
try:
|
||||||
|
compile_version(
|
||||||
|
destination_file.resolve(),
|
||||||
|
destination_file.with_suffix(".pyc").resolve(),
|
||||||
|
version,
|
||||||
|
)
|
||||||
|
except Exception as err:
|
||||||
|
return err
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def star_transfer_and_compile_file(args) -> Optional[Exception]:
|
||||||
|
return transfer_and_compile_file(*args)
|
||||||
|
|
||||||
|
|
||||||
|
# samples num_files files from the given directory
|
||||||
|
# expects the directory to have the structure
|
||||||
|
# source_dir -> identifier -> file.py
|
||||||
|
def sample_directory_splits(
|
||||||
|
data_request: DataRequest,
|
||||||
|
) -> Tuple[List[pathlib.Path], List[pathlib.Path], List[pathlib.Path]]:
|
||||||
|
all_files: Set[pathlib.Path] = set()
|
||||||
|
for identifier in data_request.source_path.iterdir():
|
||||||
|
source_file = next(identifier.glob("*.py"), None) # get the first python file from the identifier
|
||||||
|
if source_file is not None:
|
||||||
|
all_files.add(source_file)
|
||||||
|
|
||||||
|
# sample batches until we have enough files to satisfy the data requests
|
||||||
|
# this avoids running expensive tests on unsampled files
|
||||||
|
clean_sample: Set[pathlib.Path] = set()
|
||||||
|
while len(clean_sample) < data_request.total_files:
|
||||||
|
remaining_files = data_request.total_files - len(clean_sample)
|
||||||
|
sample_batch = random.sample(list(all_files), k=remaining_files)
|
||||||
|
# add the acceptable files to the sample and remove them from the population
|
||||||
|
to_add = set(candidate for candidate in sample_batch if candidate is not None)
|
||||||
|
clean_sample.update(to_add)
|
||||||
|
all_files -= to_add
|
||||||
|
|
||||||
|
full_sample = iter(clean_sample)
|
||||||
|
|
||||||
|
train = list(itertools.islice(full_sample, data_request.num_train))
|
||||||
|
test = list(itertools.islice(full_sample, data_request.num_test))
|
||||||
|
valid = list(itertools.islice(full_sample, data_request.num_valid))
|
||||||
|
|
||||||
|
return train, test, valid
|
||||||
|
|
||||||
|
|
||||||
|
def prepare_single_directory_transfer_args(data_request: DataRequest, target_dir: pathlib.Path) -> List[Tuple[pathlib.Path, pathlib.Path]]:
|
||||||
|
train, test, valid = sample_directory_splits(data_request)
|
||||||
|
|
||||||
|
transfer_args = []
|
||||||
|
for split_name, split_files in zip(("train", "test", "valid"), (train, test, valid)):
|
||||||
|
for source_file in split_files:
|
||||||
|
target_file = target_dir / split_name / f"{data_request.name}-{source_file.parent.name}" / source_file.name
|
||||||
|
transfer_args.append((source_file, target_file))
|
||||||
|
|
||||||
|
return transfer_args
|
||||||
|
|
||||||
|
|
||||||
|
# takes a list of DataRequests (each naming a source directory and its train/test/valid counts) and a target directory
|
||||||
|
# makes train, test, and valid directories in the target directory with the normalized source files
|
||||||
|
def create_code_dataset(
|
||||||
|
data_requests: List[DataRequest],
|
||||||
|
target_dir: pathlib.Path,
|
||||||
|
version: Tuple[int, int],
|
||||||
|
logger: logging.Logger,
|
||||||
|
):
|
||||||
|
with multiprocessing.Pool() as pool:
|
||||||
|
# prepare a list of file transfers to execute
|
||||||
|
logger.info(f"Sampling {', '.join(str(req.source_path.resolve()) for req in data_requests)}...")
|
||||||
|
transfer_arg_lists = pool.starmap(
|
||||||
|
prepare_single_directory_transfer_args,
|
||||||
|
zip(data_requests, itertools.repeat(target_dir)),
|
||||||
|
)
|
||||||
|
# execute the file transfers
|
||||||
|
versioned_transfer_arg_lists = [(source_file, target_file, version) for (source_file, target_file) in itertools.chain(*transfer_arg_lists)]
|
||||||
|
logger.info(f"Normalizing and Compiling {len(versioned_transfer_arg_lists)} files...")
|
||||||
|
for error in tqdm.tqdm(pool.imap_unordered(star_transfer_and_compile_file, versioned_transfer_arg_lists), total=len(versioned_transfer_arg_lists)):
|
||||||
|
if error is not None:
|
||||||
|
logger.debug(error)
|
||||||
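The split code above draws `train`, `test`, and `valid` from one shared iterator, so a file consumed for one split cannot reappear in another. A minimal sketch of that idea follows; `split_sample` and the file names are hypothetical stand-ins for illustration, not part of the repository.

```python
import itertools


def split_sample(sample, num_train, num_test, num_valid):
    """Hypothetical helper: cut one sample into three disjoint splits."""
    # A single iterator is shared by all three islice calls, so each call
    # resumes where the previous one stopped.
    it = iter(sample)
    train = list(itertools.islice(it, num_train))
    test = list(itertools.islice(it, num_test))
    valid = list(itertools.islice(it, num_valid))
    return train, test, valid


# Made-up file names standing in for the sampled source paths.
files = [f"file_{i}.py" for i in range(10)]
train, test, valid = split_sample(files, 4, 3, 2)
```

If the sample holds fewer items than requested, the later splits simply come back shorter (or empty) rather than raising an error.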
@@ -0,0 +1,67 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
|
||||||
|
import ast
|
||||||
|
import pathlib
|
||||||
|
import sys
|
||||||
|
from typing import Tuple
|
||||||
|
|
||||||
|
|
||||||
|
def version_str_to_tuple(version_str: str) -> tuple[int, int]:
|
||||||
|
# a version string is a string like 3.9.2
|
||||||
|
versions = [int(version) for version in version_str.split(".")]
|
||||||
|
return tuple(versions[:2])
|
||||||
|
|
||||||
|
|
||||||
|
# must be run in python 3.9 or later for ast.unparse() support
|
||||||
|
# version defaults to whatever version this script is running in; needs to be set explicitly for backwards compatibility
|
||||||
|
# ast only supports versions 3.4 and later
|
||||||
|
def normalize_source(
|
||||||
|
source: str,
|
||||||
|
version: Tuple[int, int] = sys.version_info[0:2],
|
||||||
|
replace_docstrings=False,
|
||||||
|
) -> str:
|
||||||
|
"""
|
||||||
|
Parse the source code into an AST, then convert back to source.
|
||||||
|
This has the following normalizing effects:
|
||||||
|
1. whitespace is normalized to the style emitted by ast.unparse (close to PEP 8)
|
||||||
|
2. each statement is on exactly one line
|
||||||
|
3. # comments are removed (note: docstrings are not removed)
|
||||||
|
|
||||||
|
:param str source: The source code to normalize
|
||||||
|
:param tuple version: The (Major, Minor) version of python to parse with; must be at least (3, 4); defaults to
|
||||||
|
same version as this script
|
||||||
|
:param bool replace_docstrings: Replace all docstrings with 'pass'
|
||||||
|
"""
|
||||||
|
tree = ast.parse(source, feature_version=version)
|
||||||
|
if replace_docstrings:
|
||||||
|
for node in ast.walk(tree):
|
||||||
|
if isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant) and isinstance(node.value.value, str):
|
||||||
|
node.value.value = "pass"
|
||||||
|
return ast.unparse(tree)
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_source_file(
|
||||||
|
source_file_path: str,
|
||||||
|
cleaned_suffix: str = "-cleaned",
|
||||||
|
version: tuple[int, int] = sys.version_info[0:2],
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Normalizes the source code in a given file, then saves it to a '-cleaned' version in the same directory
|
||||||
|
|
||||||
|
:param str source_file_path: The absolute or relative path to the source .py file
|
||||||
|
:param str cleaned_suffix: The suffix to add to the cleaned file, typically left as default
|
||||||
|
:param tuple version: The (Major, Minor) version of python to parse with; must be at least (3, 4); defaults to
|
||||||
|
same version as this script
|
||||||
|
"""
|
||||||
|
|
||||||
|
# add the cleaned_suffix to the output_path
|
||||||
|
input_path = pathlib.Path(source_file_path).resolve()
|
||||||
|
output_path = input_path.with_stem(f"{input_path.stem}{cleaned_suffix}")
|
||||||
|
|
||||||
|
with open(input_path, "r") as source_file:
|
||||||
|
normalized_source = normalize_source(source_file.read(), version=version)
|
||||||
|
|
||||||
|
with open(output_path, "w") as cleaned_file:
|
||||||
|
cleaned_file.write(normalized_source)
|
||||||
|
|
||||||
|
return output_path
|
||||||
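To make the normalizing effect of the parse/unparse round trip concrete, here is a small sketch using only the standard library (Python 3.9 or later for `ast.unparse`). The `messy` snippet is an invented input, and the sketch omits the `feature_version` argument that `normalize_source` passes.

```python
import ast

# Invented input: irregular spacing, a comment, and a statement split over lines.
messy = "def add(a,b):   # add two numbers\n    return (a+\n            b)\n"

# Parsing drops comments and layout; unparsing emits one statement per line.
normalized = ast.unparse(ast.parse(messy))
print(normalized)
```

The comment and the redundant parentheses disappear because neither is represented in the AST.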
@@ -0,0 +1,60 @@
|
|||||||
|
from io import BytesIO
|
||||||
|
from typing import Dict, List, Literal
|
||||||
|
|
||||||
|
from datasets import load_dataset
|
||||||
|
from huggingface_hub import HfApi
|
||||||
|
|
||||||
|
from .DatasetDescription import DatasetDescription
|
||||||
|
|
||||||
|
LOCAL_DATASET = Dict[Literal["train", "test", "valid"], List[str]]
|
||||||
|
|
||||||
|
|
||||||
|
def upload_single_dataset(data_files: LOCAL_DATASET, dataset_name: str, dataset_card: str):
|
||||||
|
local_datasets = load_dataset("csv", data_files=data_files)
|
||||||
|
local_datasets.push_to_hub(dataset_name, private=True)
|
||||||
|
|
||||||
|
dataset_card_with_stats = dataset_card + f"\n\nDataset Statistics:\n\n```\n{local_datasets}\n```"
|
||||||
|
|
||||||
|
api = HfApi()
|
||||||
|
api.upload_file(
|
||||||
|
path_or_fileobj=BytesIO(bytes(dataset_card_with_stats, "utf-8")),
|
||||||
|
path_in_repo="README.md",
|
||||||
|
repo_id=dataset_name,
|
||||||
|
repo_type="dataset",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def upload_dataset_to_huggingface(dataset_description: DatasetDescription):
|
||||||
|
formatted_data_requests = "\n".join(f"{str(req.source_path.resolve())}: (train: {req.num_train}, test: {req.num_test}, valid: {req.num_valid})" for req in dataset_description.data_requests)
|
||||||
|
dataset_card = f"""
|
||||||
|
# {dataset_description.name}
|
||||||
|
|
||||||
|
Created by the Syssec team @ UTD
|
||||||
|
|
||||||
|
Dataset Composition:
|
||||||
|
|
||||||
|
```
|
||||||
|
{formatted_data_requests}
|
||||||
|
```
|
||||||
|
|
||||||
|
Python version: `{".".join(map(str, dataset_description.version))}`
|
||||||
|
"""
|
||||||
|
|
||||||
|
splits: List[Literal["train", "test", "valid"]] = [
|
||||||
|
"train",
|
||||||
|
"test",
|
||||||
|
"valid",
|
||||||
|
]
|
||||||
|
|
||||||
|
# collect data files
|
||||||
|
segmentation_data_files: LOCAL_DATASET = {}
|
||||||
|
statement_data_files: LOCAL_DATASET = {}
|
||||||
|
for split in splits:
|
||||||
|
segmentation_data_files[split] = [str(path.resolve()) for path in (dataset_description.csv_dir / split / "segmentation").glob("*.csv")]
|
||||||
|
statement_data_files[split] = [str(path.resolve()) for path in (dataset_description.csv_dir / split / "statement").glob("*.csv")]
|
||||||
|
|
||||||
|
# upload datasets
|
||||||
|
segmentation_dataset_name = f"{dataset_description.huggingface_user}/segmentation-{dataset_description.name}"
|
||||||
|
upload_single_dataset(segmentation_data_files, segmentation_dataset_name, dataset_card)
|
||||||
|
statement_dataset_name = f"{dataset_description.huggingface_user}/statement-{dataset_description.name}"
|
||||||
|
upload_single_dataset(statement_data_files, statement_dataset_name, dataset_card)
|
||||||
@@ -0,0 +1,66 @@
|
|||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import pathlib
|
||||||
|
from typing import Union
|
||||||
|
import click
|
||||||
|
|
||||||
|
from dataset_generation.bytecode2csv import create_csv_dataset
|
||||||
|
from dataset_generation.create_code_dataset import create_code_dataset
|
||||||
|
from dataset_generation.DatasetDescription import DataRequest, DatasetDescription
|
||||||
|
from dataset_generation.upload_raw_dataset import upload_dataset_to_huggingface
|
||||||
|
from pylingual.utils.get_logger import get_logger
|
||||||
|
|
||||||
|
|
||||||
|
def get_dataset_description_from_arg_json(json_path: str, logger: Union[logging.Logger, None] = None) -> DatasetDescription:
|
||||||
|
json_file_path = pathlib.Path(json_path)
|
||||||
|
|
||||||
|
if not json_file_path.exists():
|
||||||
|
raise FileNotFoundError(f"{json_file_path} does not exist")
|
||||||
|
|
||||||
|
if logger:
|
||||||
|
logger.info(f"Loading dataset description from {json_file_path}...")
|
||||||
|
|
||||||
|
with json_file_path.open() as json_file:
|
||||||
|
dataset_description_dict = json.load(json_file)
|
||||||
|
|
||||||
|
dataset_description_dict["data_requests"] = [DataRequest(**d) for d in dataset_description_dict["data_requests"]]
|
||||||
|
return DatasetDescription(**dataset_description_dict)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Samples, splits, processes, and uploads a given dataset described by JSON.")
|
||||||
|
@click.argument("json_path", type=str)
|
||||||
|
def main(json_path: str):
|
||||||
|
logger = get_logger("prepare-dataset")
|
||||||
|
|
||||||
|
dataset_description = get_dataset_description_from_arg_json(json_path, logger)
|
||||||
|
logger.debug(dataset_description)
|
||||||
|
|
||||||
|
if dataset_description.code_dir.exists():
|
||||||
|
raise FileExistsError(f"{dataset_description.code_dir} already exists! The dataset name is probably already taken.")
|
||||||
|
|
||||||
|
logger.info("Creating code dataset...")
|
||||||
|
if not (dataset_description.data_requests and dataset_description.code_dir and dataset_description.version):
|
||||||
|
logger.error("Dataset description is missing required fields")
|
||||||
|
exit(1)
|
||||||
|
create_code_dataset(
|
||||||
|
dataset_description.data_requests,
|
||||||
|
dataset_description.code_dir,
|
||||||
|
dataset_description.version,
|
||||||
|
logger,
|
||||||
|
)
|
||||||
|
|
||||||
|
# create csv dataset
|
||||||
|
logger.info("Converting code dataset to csv...")
|
||||||
|
create_csv_dataset(
|
||||||
|
dataset_description.code_dir,
|
||||||
|
dataset_description.csv_dir,
|
||||||
|
dataset_description.data_requests,
|
||||||
|
logger,
|
||||||
|
)
|
||||||
|
|
||||||
|
logger.info(f"Uploading {dataset_description.name} to HuggingFace...")
|
||||||
|
upload_dataset_to_huggingface(dataset_description)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,24 @@
|
|||||||
|
{
|
||||||
|
"name": "sample_dataset_name",
|
||||||
|
"version": [3, 6],
|
||||||
|
"save_to_dir": "./save_dir/",
|
||||||
|
"huggingface_user": "sample_user",
|
||||||
|
|
||||||
|
"data_requests":
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"name": "dataset",
|
||||||
|
"source_path": "./dataset",
|
||||||
|
"num_train": 200,
|
||||||
|
"num_test": 200,
|
||||||
|
"num_valid": 200
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "dataset2",
|
||||||
|
"source_path": "./dataset2",
|
||||||
|
"num_train": 200,
|
||||||
|
"num_test": 200,
|
||||||
|
"num_valid": 200
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
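The sample JSON above is turned into objects by unpacking each `data_requests` entry into a dataclass. The real `DataRequest` class is defined elsewhere in the repository, so the sketch below uses a hypothetical `DataRequestSketch` whose fields are assumed from the sample keys only.

```python
import json
import pathlib
from dataclasses import dataclass


@dataclass
class DataRequestSketch:
    """Hypothetical stand-in for DataRequest; fields assumed from the sample JSON."""

    name: str
    source_path: pathlib.Path
    num_train: int
    num_test: int
    num_valid: int

    def __post_init__(self):
        # JSON gives a plain string; convert it so path operations work.
        self.source_path = pathlib.Path(self.source_path)


raw = json.loads(
    """
    {
      "data_requests": [
        {"name": "dataset", "source_path": "./dataset",
         "num_train": 200, "num_test": 200, "num_valid": 200},
        {"name": "dataset2", "source_path": "./dataset2",
         "num_train": 200, "num_test": 200, "num_valid": 200}
      ]
    }
    """
)

# Same unpacking pattern as get_dataset_description_from_arg_json.
requests = [DataRequestSketch(**d) for d in raw["data_requests"]]
total_files = sum(r.num_train + r.num_test + r.num_valid for r in requests)
```

Unpacking with `**d` means an unexpected or misspelled key in the JSON fails immediately with a `TypeError`, which is a cheap form of validation.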
@@ -0,0 +1,18 @@
|
|||||||
|
{
|
||||||
|
"base_repo_name": "sample_user/sample_segmenter_name",
|
||||||
|
"dataset_repo_name": "sample_user/segmentation-sample_dataset_name",
|
||||||
|
"pretrained_mlm_repo_name": "",
|
||||||
|
"cache_dir": "./cache-dir/",
|
||||||
|
"max_token_length": 512,
|
||||||
|
"dataset_percentage": 100,
|
||||||
|
"mlm_training_parameters": {
|
||||||
|
"batch_size": 48,
|
||||||
|
"epochs": 2,
|
||||||
|
"learning_rate": 5e-5
|
||||||
|
},
|
||||||
|
"segmentation_training_parameters": {
|
||||||
|
"batch_size": 48,
|
||||||
|
"epochs": 2,
|
||||||
|
"learning_rate": 2e-5
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,16 @@
|
|||||||
|
{
|
||||||
|
"base_repo_name": "sample_user/sample_name",
|
||||||
|
"dataset_repo_name": "sample_user/statement-sample_dataset_name",
|
||||||
|
"tokenizer_repo_name": "sample_user/sample_name-tok",
|
||||||
|
"pretrained_seq2seq_repo_name": "Salesforce/codet5-base",
|
||||||
|
"cache_dir": "./cache-dir/",
|
||||||
|
"max_token_length": 256,
|
||||||
|
"dataset_percentage": 100,
|
||||||
|
"do_eval": true,
|
||||||
|
"fp16": true,
|
||||||
|
"statement_training_parameters": {
|
||||||
|
"batch_size": 24,
|
||||||
|
"epochs": 2,
|
||||||
|
"learning_rate": 2e-5
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,74 @@
|
|||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import pathlib
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class TrainingParameters:
|
||||||
|
batch_size: int
|
||||||
|
epochs: int
|
||||||
|
learning_rate: float
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class SegmentationConfiguration:
|
||||||
|
base_repo_name: str
|
||||||
|
dataset_repo_name: str
|
||||||
|
pretrained_mlm_repo_name: str
|
||||||
|
cache_dir: pathlib.Path
|
||||||
|
max_token_length: int
|
||||||
|
dataset_percentage: int
|
||||||
|
mlm_training_parameters: TrainingParameters
|
||||||
|
segmentation_training_parameters: TrainingParameters
|
||||||
|
|
||||||
|
@property
|
||||||
|
def tokenizer_repo_name(self):
|
||||||
|
return self.base_repo_name + "-tokenizer"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def tokenizer_json_path(self):
|
||||||
|
return self.cache_dir / "tokenizers" / self.tokenizer_repo_name / "tokenizer.json"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def tokenized_dataset_repo_name(self):
|
||||||
|
return self.dataset_repo_name + "-tokenized"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def mlm_repo_name(self):
|
||||||
|
return self.base_repo_name + "-mlm"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def mlm_dir(self):
|
||||||
|
return self.cache_dir / "models" / self.mlm_repo_name
|
||||||
|
|
||||||
|
@property
|
||||||
|
def segmenter_repo_name(self):
|
||||||
|
return self.base_repo_name + "-segmenter"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def segmenter_dir(self):
|
||||||
|
return self.cache_dir / "models" / self.segmenter_repo_name
|
||||||
|
|
||||||
|
@property
|
||||||
|
def dataset_dir(self):
|
||||||
|
return self.cache_dir / "datasets" / self.dataset_repo_name
|
||||||
|
|
||||||
|
def __post_init__(self):
|
||||||
|
self.cache_dir = pathlib.Path(self.cache_dir)
|
||||||
|
|
||||||
|
|
||||||
|
def parse_segmentation_config_json(json_file_path: pathlib.Path, logger: Optional[logging.Logger] = None) -> SegmentationConfiguration:
|
||||||
|
if not json_file_path.exists():
|
||||||
|
raise FileNotFoundError(f"{json_file_path} does not exist")
|
||||||
|
|
||||||
|
if logger:
|
||||||
|
logger.info(f"Loading model description from {json_file_path}...")
|
||||||
|
|
||||||
|
with json_file_path.open() as json_file:
|
||||||
|
segmentation_config_dict = json.load(json_file)
|
||||||
|
|
||||||
|
segmentation_config_dict["mlm_training_parameters"] = TrainingParameters(**segmentation_config_dict["mlm_training_parameters"])
|
||||||
|
segmentation_config_dict["segmentation_training_parameters"] = TrainingParameters(**segmentation_config_dict["segmentation_training_parameters"])
|
||||||
|
return SegmentationConfiguration(**segmentation_config_dict)
|
||||||
@@ -0,0 +1,152 @@
|
|||||||
|
import ast
|
||||||
|
import functools
|
||||||
|
import os
|
||||||
|
import pathlib
|
||||||
|
import click
|
||||||
|
|
||||||
|
from datasets import load_dataset
|
||||||
|
from huggingface_hub import hf_hub_download
|
||||||
|
from SegmentationConfiguration import SegmentationConfiguration, parse_segmentation_config_json
|
||||||
|
from pylingual.segmentation.sliding_window import sliding_window
|
||||||
|
from transformers import PreTrainedTokenizerFast
|
||||||
|
|
||||||
|
bytecode_separator = " <SEP> "
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenizer(tokenizer_repo_name: str, cache_dir: pathlib.Path) -> PreTrainedTokenizerFast:
|
||||||
|
tokenizer_dir = cache_dir / "tokenizers" / tokenizer_repo_name
|
||||||
|
|
||||||
|
tokenizer_file = hf_hub_download(repo_id=tokenizer_repo_name, filename="tokenizer.json", token=True, cache_dir=str(tokenizer_dir))
|
||||||
|
tokenizer = PreTrainedTokenizerFast(
|
||||||
|
tokenizer_file=tokenizer_file,
|
||||||
|
unk_token="[UNK]",
|
||||||
|
pad_token="[PAD]",
|
||||||
|
cls_token="[CLS]",
|
||||||
|
sep_token="[SEP]",
|
||||||
|
mask_token="[MASK]",
|
||||||
|
)
|
||||||
|
|
||||||
|
return tokenizer
|
||||||
|
|
||||||
|
|
||||||
|
# we need to make sure we align all the labels with the proper words.
|
||||||
|
def align_labels_with_tokens(labels, word_ids):
|
||||||
|
label_names = ["B", "I", "E"]
|
||||||
|
id2label = {str(i): label for i, label in enumerate(label_names)}
|
||||||
|
label2id = {v: k for k, v in id2label.items()}
|
||||||
|
|
||||||
|
new_labels = []
|
||||||
|
current_word = None
|
||||||
|
for word_id in word_ids:
|
||||||
|
if word_id != current_word:
|
||||||
|
# Start of a new word!
|
||||||
|
current_word = word_id
|
||||||
|
label = -100 if word_id is None else int(label2id[labels[word_id]])
|
||||||
|
new_labels.append(label)
|
||||||
|
elif word_id is None:
|
||||||
|
# Special token
|
||||||
|
new_labels.append(-100)
|
||||||
|
else:
|
||||||
|
# Same word as previous token
|
||||||
|
label = int(label2id[labels[word_id]])
|
||||||
|
new_labels.append(label)
|
||||||
|
return new_labels
|
||||||
|
|
||||||
|
|
||||||
|
# processing function used to tokenize the dataset
|
||||||
|
def tokenize_and_align_labels(tokenizer: PreTrainedTokenizerFast, max_length: int, examples):
|
||||||
|
MAX_WINDOW_LENGTH = 512
|
||||||
|
STEP_SIZE = 128
|
||||||
|
|
||||||
|
# parse the strings into lists to better work with the bytecode and boundaries
|
||||||
|
parsed_bc = [(codeobj.split(bytecode_separator), ast.literal_eval(bounds)) for codeobj, bounds in zip(examples["bytecode"], examples["boundary"])]
|
||||||
|
|
||||||
|
codeobj_tokens = []
|
||||||
|
|
||||||
|
# count the tokens for each bytecode instruction in a codeobj
|
||||||
|
for codeobj, bounds in parsed_bc:
|
||||||
|
token_list = []
|
||||||
|
|
||||||
|
for bc, bound in zip(codeobj, bounds):
|
||||||
|
token_list.append(((bc, bound), len(tokenizer(bc)[0])))
|
||||||
|
|
||||||
|
codeobj_tokens.append(token_list)
|
||||||
|
|
||||||
|
windows = [sliding_window(codeobj, MAX_WINDOW_LENGTH, STEP_SIZE) for codeobj in codeobj_tokens]
|
||||||
|
|
||||||
|
# remake examples using our windows
|
||||||
|
examples["boundary"] = []
|
||||||
|
examples["bytecode"] = []
|
||||||
|
|
||||||
|
# go through each window
|
||||||
|
for window in windows:
|
||||||
|
for item in window:
|
||||||
|
# where we will temporarily store our bytecode and bounds
|
||||||
|
bytecode = []
|
||||||
|
bounds = []
|
||||||
|
|
||||||
|
for bc in item[0]:
|
||||||
|
bytecode.append(bc[0])
|
||||||
|
bounds.append(bc[1])
|
||||||
|
|
||||||
|
# append it into examples
|
||||||
|
examples["bytecode"].append(bytecode_separator.join(bytecode))
|
||||||
|
examples["boundary"].append(str(bounds))
|
||||||
|
|
||||||
|
tokenized_inputs = tokenizer(
|
||||||
|
examples["bytecode"],
|
||||||
|
truncation=True,
|
||||||
|
max_length=max_length,
|
||||||
|
)
|
||||||
|
|
||||||
|
all_labels = examples["boundary"]
|
||||||
|
new_labels = []
|
||||||
|
for i, labels in enumerate(all_labels):
|
||||||
|
labels = labels.replace("'", "").strip("][").split(", ")
|
||||||
|
word_ids = tokenized_inputs.word_ids(i)
|
||||||
|
labels_len = len(labels)
|
||||||
|
max_word_id = word_ids[-2]
|
||||||
|
# some examples are tokenized into more words than they have labels; rather than raising an error,
|
||||||
|
# mask every label with -100 and keep the example as noisy data.
|
||||||
|
if max_word_id >= labels_len:
|
||||||
|
new_labels.append([-100] * len(word_ids))
|
||||||
|
else:
|
||||||
|
new_labels.append(align_labels_with_tokens(labels, word_ids))
|
||||||
|
|
||||||
|
tokenized_inputs["labels"] = new_labels
|
||||||
|
|
||||||
|
return tokenized_inputs
|
||||||
|
|
||||||
|
|
||||||
|
def tokenize_segmentation_dataset(config: SegmentationConfiguration):
|
||||||
|
raw_dataset = load_dataset(config.dataset_repo_name, token=True, cache_dir=str(config.dataset_dir))
|
||||||
|
|
||||||
|
tokenizer = load_tokenizer(config.tokenizer_repo_name, config.cache_dir)
|
||||||
|
prepped_tokenize_and_align_labels = functools.partial(tokenize_and_align_labels, tokenizer, config.max_token_length)
|
||||||
|
|
||||||
|
# tokenize input dataset
|
||||||
|
column_names = raw_dataset["train"].column_names
|
||||||
|
tokenized_datasets = raw_dataset.map(
|
||||||
|
prepped_tokenize_and_align_labels,
|
||||||
|
batched=True,
|
||||||
|
remove_columns=column_names,
|
||||||
|
num_proc=os.cpu_count(),
|
||||||
|
desc="Tokenizing datasets",
|
||||||
|
)
|
||||||
|
|
||||||
|
tokenized_datasets.push_to_hub(
|
||||||
|
config.tokenized_dataset_repo_name,
|
||||||
|
private=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Script to tokenize the segmentation dataset given a segmentation json.")
|
||||||
|
@click.argument("json_path", type=str)
|
||||||
|
def main(json_path: str):
|
||||||
|
json_file_path = pathlib.Path(json_path)
|
||||||
|
segmentation_config = parse_segmentation_config_json(json_file_path)
|
||||||
|
tokenize_segmentation_dataset(segmentation_config)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
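The label alignment in `align_labels_with_tokens` can be summarized as: special tokens get `-100`, and every sub-token inherits the label of the word it belongs to. The sketch below is a condensed equivalent written for illustration; `align`, `LABEL_NAMES`, and `LABEL2ID` are names introduced here, and the `word_ids` list is a made-up example rather than real tokenizer output.

```python
LABEL_NAMES = ["B", "I", "E"]
LABEL2ID = {label: i for i, label in enumerate(LABEL_NAMES)}


def align(labels, word_ids):
    """Condensed sketch of the alignment rule, not the repository function."""
    # word_id None marks a special token; -100 is the index that PyTorch's
    # cross-entropy loss ignores by default.
    return [-100 if w is None else LABEL2ID[labels[w]] for w in word_ids]


# Made-up example: word 0 was split into two sub-tokens, word 1 into one,
# with a special token at each end.
word_ids = [None, 0, 0, 1, None]
aligned = align(["B", "E"], word_ids)
print(aligned)
```

Both sub-tokens of word 0 receive the id for `B`, so the model is trained on a label for every sub-token rather than only the first one.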
@@ -0,0 +1,195 @@
|
|||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import pathlib
|
||||||
|
import click
|
||||||
|
|
||||||
|
from datasets import load_dataset
|
||||||
|
from huggingface_hub import hf_hub_download, repo_exists
|
||||||
|
from SegmentationConfiguration import SegmentationConfiguration, parse_segmentation_config_json
|
||||||
|
from transformers import AutoModelForMaskedLM, DataCollatorForLanguageModeling, PreTrainedTokenizerFast, RobertaConfig, RobertaForMaskedLM, Trainer, TrainingArguments
|
||||||
|
|
||||||
|
from pylingual.segmentation.sliding_window import sliding_window
|
||||||
|
|
||||||
|
bytecode_separator = " <SEP> "
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenizer(tokenizer_repo_name: str, cache_dir: pathlib.Path) -> PreTrainedTokenizerFast:
|
||||||
|
tokenizer_dir = cache_dir / "tokenizers" / tokenizer_repo_name
|
||||||
|
|
||||||
|
tokenizer_file = hf_hub_download(
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
filename="tokenizer.json",
|
||||||
|
token=True,
|
||||||
|
cache_dir=str(tokenizer_dir),
|
||||||
|
)
|
||||||
|
tokenizer = PreTrainedTokenizerFast(
|
||||||
|
tokenizer_file=tokenizer_file,
|
||||||
|
unk_token="[UNK]",
|
||||||
|
pad_token="[PAD]",
|
||||||
|
cls_token="[CLS]",
|
||||||
|
sep_token="[SEP]",
|
||||||
|
mask_token="[MASK]",
|
||||||
|
)
|
||||||
|
|
||||||
|
return tokenizer
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenized_train_dataset(
|
||||||
|
dataset_repo_name: str,
|
||||||
|
tokenizer: PreTrainedTokenizerFast,
|
||||||
|
max_length: int,
|
||||||
|
cache_dir: pathlib.Path,
|
||||||
|
):
|
||||||
|
dataset_dir = cache_dir / "datasets" / dataset_repo_name
|
||||||
|
raw_dataset = load_dataset(dataset_repo_name, token=True, cache_dir=dataset_dir, split="train")
|
||||||
|
|
||||||
|
# tokenize the input data
|
||||||
|
column_names = raw_dataset.column_names
|
||||||
|
|
||||||
|
def tokenize(examples):
|
||||||
|
# sliding window compatibility
|
||||||
|
MAX_WINDOW_LENGTH = 512
|
||||||
|
STEP_SIZE = 128
|
||||||
|
|
||||||
|
# parse the strings into lists to better work with the bytecode and boundaries
|
||||||
|
parsed_bc = [codeobj.split(bytecode_separator) for codeobj in examples["bytecode"]]
|
||||||
|
|
||||||
|
codeobj_tokens = []
|
||||||
|
|
||||||
|
# count the tokens for each bytecode instruction in a codeobj
|
||||||
|
for codeobj in parsed_bc:
|
||||||
|
token_list = []
|
||||||
|
|
||||||
|
for bytecode in codeobj:
|
||||||
|
token_list.append((bytecode, len(tokenizer(bytecode)[0])))
|
||||||
|
|
||||||
|
codeobj_tokens.append(token_list)
|
||||||
|
|
||||||
|
windows = [sliding_window(codeobj, MAX_WINDOW_LENGTH, STEP_SIZE) for codeobj in codeobj_tokens]
|
||||||
|
|
||||||
|
# remake examples using our windows
|
||||||
|
examples["bytecode"] = []
|
||||||
|
|
||||||
|
# go through each window
|
||||||
|
for window in windows:
|
||||||
|
for item in window:
|
||||||
|
# temporarily store the bytecode for this window
|
||||||
|
bytecode = []
|
||||||
|
|
||||||
|
for bc in item[0]:
|
||||||
|
bytecode.append(bc)
|
||||||
|
|
||||||
|
# append to examples
|
||||||
|
examples["bytecode"].append(bytecode_separator.join(bytecode))
|
||||||
|
|
||||||
|
return tokenizer(examples["bytecode"], max_length=max_length, truncation=True)
|
||||||
|
|
||||||
|
tokenized_dataset = raw_dataset.map(
|
||||||
|
tokenize,
|
||||||
|
batched=True,
|
||||||
|
remove_columns=column_names,
|
||||||
|
num_proc=os.cpu_count(),
|
||||||
|
desc="Tokenizing datasets",
|
||||||
|
)
|
||||||
|
|
||||||
|
return tokenized_dataset
|
||||||
|
|
||||||
|
|
||||||
|
def load_pretrained_mlm(
|
||||||
|
pretrained_mlm_repo_name: str,
|
||||||
|
tokenizer_embedding_length: int,
|
||||||
|
cache_dir: pathlib.Path,
|
||||||
|
) -> AutoModelForMaskedLM:
|
||||||
|
# load a basic pretrained BERT model
|
||||||
|
pretrained_mlm_dir = cache_dir / "models" / pretrained_mlm_repo_name
|
||||||
|
model = AutoModelForMaskedLM.from_pretrained(pretrained_mlm_repo_name, cache_dir=str(pretrained_mlm_dir))
|
||||||
|
|
||||||
|
# resize token embeddings to match the tokenizer's vocabulary size
|
||||||
|
model.resize_token_embeddings(tokenizer_embedding_length)
|
||||||
|
|
||||||
|
return model
|
||||||
|
|
||||||
|
|
||||||
|
def initialize_untrained_mlm(
|
||||||
|
tokenizer_embedding_length: int,
|
||||||
|
max_token_length: int,
|
||||||
|
) -> RobertaForMaskedLM:
|
||||||
|
# initialize untrained RoBERTa model
|
||||||
|
# most configuration options set to match https://huggingface.co/microsoft/codebert-base/blob/main/config.json for direct comparison
|
||||||
|
model_config = RobertaConfig(
|
||||||
|
max_position_embeddings=max_token_length, # INPUT LENGTH LIMIT
|
||||||
|
vocab_size=tokenizer_embedding_length,
|
||||||
|
layer_norm_eps=1e-05,
|
||||||
|
type_vocab_size=1,
|
||||||
|
)
|
||||||
|
model = RobertaForMaskedLM(model_config)
|
||||||
|
|
||||||
|
return model
|
||||||
|
|
||||||
|
|
||||||
|
def train_mlm(config: SegmentationConfiguration):
|
||||||
|
if repo_exists(config.base_repo_name):
|
||||||
|
logging.error(f"{config.base_repo_name} already exists")
|
||||||
|
exit(1)
|
||||||
|
|
||||||
|
using_pretrained_model = bool(config.pretrained_mlm_repo_name)
|
||||||
|
# training arguments; for now the configuration is carried over from a regular T5 translation model.
|
||||||
|
training_args = TrainingArguments(
|
||||||
|
output_dir=str(config.mlm_dir),
|
||||||
|
num_train_epochs=config.mlm_training_parameters.epochs,
|
||||||
|
per_device_train_batch_size=config.mlm_training_parameters.batch_size,
|
||||||
|
save_steps=1000,
|
||||||
|
save_total_limit=5,
|
||||||
|
prediction_loss_only=True,
|
||||||
|
push_to_hub=True,
|
||||||
|
hub_model_id=config.mlm_repo_name,
|
||||||
|
hub_private_repo=True,
|
||||||
|
ddp_backend="nccl",
|
||||||
|
ddp_find_unused_parameters=using_pretrained_model, # only look for unused parameters in pretrained models
|
||||||
|
remove_unused_columns=False,
|
||||||
|
)
|
||||||
|
|
||||||
|
tokenizer = load_tokenizer(config.tokenizer_repo_name, config.cache_dir)
|
||||||
|
|
||||||
|
# Set DataCollator for MLM task, set the probability of masking.
|
||||||
|
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
|
||||||
|
|
||||||
|
if using_pretrained_model:
|
||||||
|
pretrained_mlm = load_pretrained_mlm(config.pretrained_mlm_repo_name, len(tokenizer), config.cache_dir)
|
||||||
|
else:
|
||||||
|
pretrained_mlm = initialize_untrained_mlm(len(tokenizer), config.max_token_length + 2)
|
||||||
|
|
||||||
|
tokenized_training_data = load_tokenized_train_dataset(config.dataset_repo_name, tokenizer, config.max_token_length, config.cache_dir)
|
||||||
|
|
||||||
|
# Hugging Face Trainer: trains the MLM (pretrained or freshly initialized) on the tokenized data
|
||||||
|
trainer = Trainer(
|
||||||
|
model=pretrained_mlm,
|
||||||
|
args=training_args,
|
||||||
|
data_collator=data_collator,
|
||||||
|
train_dataset=tokenized_training_data,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Training
|
||||||
|
trainer.train()
|
||||||
|
|
||||||
|
if int(os.environ.get("LOCAL_RANK", "0")) == 0:
|
||||||
|
# Save the model
|
||||||
|
trainer.save_model(config.mlm_dir)
|
||||||
|
|
||||||
|
trainer.push_to_hub(
|
||||||
|
finetuned_from=config.pretrained_mlm_repo_name,
|
||||||
|
dataset=config.dataset_repo_name,
|
||||||
|
commit_message=f"Trained on {config.dataset_repo_name} using {config.tokenizer_repo_name}",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Training script for the masked language model pretraining for the segmentation model given a segmentation json.")
|
||||||
|
@click.argument("json_path", type=str)
|
||||||
|
def main(json_path: str):
|
||||||
|
json_file_path = pathlib.Path(json_path)
|
||||||
|
segmentation_config = parse_segmentation_config_json(json_file_path)
|
||||||
|
train_mlm(segmentation_config)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,155 @@
|
|||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import pathlib
|
||||||
|
import click
|
||||||
|
|
||||||
|
import evaluate
|
||||||
|
import numpy as np
|
||||||
|
from datasets import ReadInstruction, load_dataset
|
||||||
|
from huggingface_hub import hf_hub_download, repo_exists
|
||||||
|
from SegmentationConfiguration import SegmentationConfiguration, parse_segmentation_config_json
|
||||||
|
from transformers import AutoModelForTokenClassification, DataCollatorForTokenClassification, PreTrainedTokenizerFast, Trainer, TrainingArguments
|
||||||
|
|
||||||
|
# two dictionaries, id2label and label2id, which contain the mappings from ID to label and vice versa.
|
||||||
|
label_names = ["B", "I", "E"]
|
||||||
|
id2label = {str(i): label for i, label in enumerate(label_names)}
|
||||||
|
label2id = {v: k for k, v in id2label.items()}
|
||||||
|
|
||||||
|
|
||||||
|
# compute_metrics: seqeval-based precision, recall, F1, and accuracy, computed during evaluation.
|
||||||
|
def compute_metrics(eval_preds):
|
||||||
|
metric = evaluate.load("seqeval")
|
||||||
|
logits, labels = eval_preds
|
||||||
|
predictions = np.argmax(logits, axis=-1)
|
||||||
|
|
||||||
|
# Remove ignored index (special tokens) and convert to labels
|
||||||
|
# noqa: E741
|
||||||
|
true_labels = [[label_names[l] for l in label if l != -100] for label in labels]
|
||||||
|
true_predictions = [[label_names[p] for (p, l) in zip(prediction, label) if l != -100] for prediction, label in zip(predictions, labels)]
|
||||||
|
all_metrics = metric.compute(predictions=true_predictions, references=true_labels)
|
||||||
|
return {
|
||||||
|
"precision": all_metrics["overall_precision"],
|
||||||
|
"recall": all_metrics["overall_recall"],
|
||||||
|
"f1": all_metrics["overall_f1"],
|
||||||
|
"accuracy": all_metrics["overall_accuracy"],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenizer(tokenizer_repo_name: str, cache_dir: pathlib.Path) -> PreTrainedTokenizerFast:
|
||||||
|
tokenizer_dir = cache_dir / "tokenizers" / tokenizer_repo_name
|
||||||
|
|
||||||
|
tokenizer_file = hf_hub_download(
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
filename="tokenizer.json",
|
||||||
|
token=True,
|
||||||
|
cache_dir=str(tokenizer_dir),
|
||||||
|
)
|
||||||
|
tokenizer = PreTrainedTokenizerFast(
|
||||||
|
tokenizer_file=tokenizer_file,
|
||||||
|
unk_token="[UNK]",
|
||||||
|
pad_token="[PAD]",
|
||||||
|
cls_token="[CLS]",
|
||||||
|
sep_token="[SEP]",
|
||||||
|
mask_token="[MASK]",
|
||||||
|
)
|
||||||
|
|
||||||
|
return tokenizer
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenized_train_and_valid_dataset(dataset_repo_name: str, cache_dir: pathlib.Path, dataset_percentage: int = 100):
|
||||||
|
dataset_dir = cache_dir / "datasets" / dataset_repo_name
|
||||||
|
# Load the tokenized dataset
|
||||||
|
tokenized_train_dataset = load_dataset(
|
||||||
|
dataset_repo_name,
|
||||||
|
token=True,
|
||||||
|
cache_dir=str(dataset_dir),
|
||||||
|
split=ReadInstruction("train", to=dataset_percentage, unit="%"),
|
||||||
|
)
|
||||||
|
|
||||||
|
tokenized_validation_dataset = load_dataset(
|
||||||
|
dataset_repo_name,
|
||||||
|
token=True,
|
||||||
|
cache_dir=str(dataset_dir),
|
||||||
|
split="valid",
|
||||||
|
)
|
||||||
|
|
||||||
|
return tokenized_train_dataset, tokenized_validation_dataset
|
||||||
|
|
||||||
|
|
||||||
|
def train_segmentation_model(config: SegmentationConfiguration):
|
||||||
|
if repo_exists(config.base_repo_name):
|
||||||
|
logging.error(f"{config.base_repo_name} already exists")
|
||||||
|
exit(1)
|
||||||
|
# training arguments.
|
||||||
|
training_args = TrainingArguments(
|
||||||
|
output_dir=str(config.segmenter_dir),
|
||||||
|
overwrite_output_dir=True,
|
||||||
|
eval_strategy="epoch",
|
||||||
|
logging_strategy="epoch",
|
||||||
|
save_strategy="epoch",
|
||||||
|
learning_rate=config.segmentation_training_parameters.learning_rate,
|
||||||
|
num_train_epochs=config.segmentation_training_parameters.epochs,
|
||||||
|
per_device_train_batch_size=config.segmentation_training_parameters.batch_size,
|
||||||
|
save_steps=1000,
|
||||||
|
weight_decay=0.01,
|
||||||
|
fp16=True,
|
||||||
|
push_to_hub=True,
|
||||||
|
hub_model_id=config.segmenter_repo_name,
|
||||||
|
hub_private_repo=True,
|
||||||
|
ddp_backend="nccl",
|
||||||
|
ddp_find_unused_parameters=True,
|
||||||
|
save_total_limit=5,
|
||||||
|
)
|
||||||
|
|
||||||
|
# load a basic pretrained BERT model
|
||||||
|
model = AutoModelForTokenClassification.from_pretrained(
|
||||||
|
pretrained_model_name_or_path=config.mlm_repo_name,
|
||||||
|
id2label=id2label,
|
||||||
|
label2id=label2id,
|
||||||
|
token=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Set DataCollator for DataCollatorForTokenClassification
|
||||||
|
tokenizer = load_tokenizer(config.tokenizer_repo_name, config.cache_dir)
|
||||||
|
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer, max_length=config.max_token_length)
|
||||||
|
|
||||||
|
(
|
||||||
|
tokenized_train_dataset,
|
||||||
|
tokenized_validation_dataset,
|
||||||
|
) = load_tokenized_train_and_valid_dataset(config.tokenized_dataset_repo_name, config.cache_dir, config.dataset_percentage)
|
||||||
|
|
||||||
|
# Hugging face trainer: a Trainer class to fine-tune pretrained models
|
||||||
|
trainer = Trainer(
|
||||||
|
model=model,
|
||||||
|
args=training_args,
|
||||||
|
data_collator=data_collator,
|
||||||
|
train_dataset=tokenized_train_dataset,
|
||||||
|
eval_dataset=tokenized_validation_dataset,
|
||||||
|
compute_metrics=compute_metrics,
|
||||||
|
tokenizer=tokenizer,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Training
|
||||||
|
trainer.train()
|
||||||
|
|
||||||
|
if int(os.environ["LOCAL_RANK"]) == 0:
|
||||||
|
# Save the model
|
||||||
|
trainer.save_model(str(config.segmenter_dir))
|
||||||
|
|
||||||
|
trainer.push_to_hub(
|
||||||
|
finetuned_from=config.mlm_repo_name,
|
||||||
|
dataset=config.tokenized_dataset_repo_name,
|
||||||
|
commit_message=f"Trained on {config.tokenized_dataset_repo_name} using {config.mlm_repo_name}",
|
||||||
|
)


@click.command(help="Training script for the segmentation model given a segmentation json.")
@click.argument("json_path", type=str)
def main(json_path: str):
    json_file_path = pathlib.Path(json_path)
    segmentation_config = parse_segmentation_config_json(json_file_path)
    train_segmentation_model(segmentation_config)


if __name__ == "__main__":
    main()
@@ -0,0 +1,96 @@
import logging
import pathlib
import click

from datasets import ReadInstruction, load_dataset
from huggingface_hub import HfApi, create_repo, repo_exists
from SegmentationConfiguration import SegmentationConfiguration, parse_segmentation_config_json
from tokenizers import Tokenizer, decoders, models, normalizers, pre_tokenizers, processors, trainers

special_tokens = ["[UNK]", "[PAD]", "[CLS]", "[SEP]", "[MASK]"]


def get_untrained_tokenizer() -> Tokenizer:
    # WordPiece tokenization for BERT.
    tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))

    # The normalizer decomposes accented characters (NFD) and strips the accents.
    tokenizer.normalizer = normalizers.Sequence([normalizers.NFD(), normalizers.StripAccents()])

    # The pre-tokenizer splits on <SEP> tokens and removes them.
    tokenizer.pre_tokenizer = pre_tokenizers.Split("<SEP>", "removed")

    return tokenizer


def post_training_configuration(tokenizer: Tokenizer):
    cls_token_id = tokenizer.token_to_id("[CLS]")
    sep_token_id = tokenizer.token_to_id("[SEP]")

    # Set decoder for the tokenizer
    tokenizer.decoder = decoders.WordPiece(prefix="##")

    # For the TemplateProcessor, we have to specify how to treat a single sentence and a pair of sentences.
    tokenizer.post_processor = processors.TemplateProcessing(
        single="[CLS]:0 $A:0 [SEP]:0",
        pair="[CLS]:0 $A:0 [SEP]:0 $B:1 [SEP]:1",
        special_tokens=[("[CLS]", cls_token_id), ("[SEP]", sep_token_id)],
    )


def save_and_upload_tokenizer(
    tokenizer: Tokenizer,
    tokenizer_json_path: pathlib.Path,
    tokenizer_repo_name: str,
    dataset_name: str,
):
    # Save the tokenizer locally
    tokenizer_json_path.parent.mkdir(parents=True, exist_ok=True)
    tokenizer.save(str(tokenizer_json_path.resolve()))

    # Upload the tokenizer to Hugging Face
    api = HfApi()
    create_repo(tokenizer_repo_name, exist_ok=True, private=True)
    api.upload_file(
        path_in_repo="tokenizer.json",
        path_or_fileobj=str(tokenizer_json_path.resolve()),
        repo_id=tokenizer_repo_name,
        commit_message=f"Trained tokenizer using {dataset_name}",
    )


def train_tokenizer(config: SegmentationConfiguration):
    if repo_exists(config.base_repo_name):
        logging.error(f"{config.base_repo_name} already exists")
        exit(1)

    tokenizer = get_untrained_tokenizer()

    train_dataset = load_dataset(
        config.dataset_repo_name,
        token=True,
        split=ReadInstruction("train", to=config.dataset_percentage, unit="%"),
    )["bytecode"]
    trainer = trainers.WordPieceTrainer(vocab_size=30000, special_tokens=special_tokens)
    tokenizer.train_from_iterator(train_dataset, trainer=trainer)

    post_training_configuration(tokenizer)

    save_and_upload_tokenizer(
        tokenizer,
        config.tokenizer_json_path,
        config.tokenizer_repo_name,
        config.dataset_repo_name,
    )


@click.command(help="Training script for the bytecode tokenizer for the segmentation model given a segmentation json.")
@click.argument("json_path", type=str)
def main(json_path: str):
    json_file_path = pathlib.Path(json_path)
    segmentation_config = parse_segmentation_config_json(json_file_path)
    train_tokenizer(segmentation_config)


if __name__ == "__main__":
    main()
@@ -0,0 +1,18 @@
# seq2seq

- train_tokenizer_auto.py:
  - trains the manual tokenizer

- tokenize_seq2seq.py:
  - tokenizes the dataset for the seq2seq model

- train_seq2seq.py:
  - fine-tunes the pretrained model
  - creates a sequence-to-sequence translation model

- StatementConfiguration.py:
  - defines the JSON format for statement translation training
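
As a rough illustration of the JSON format that `StatementConfiguration.py` defines, the sketch below mirrors its dataclass fields in a trimmed, self-contained form and parses a sample file. Every value in `EXAMPLE_CONFIG` (repository names, batch size, learning rate, and so on) is a hypothetical placeholder, and `parse_config` and `load_example` are illustrative helper names rather than functions from this repository.

```python
import json
import pathlib
import tempfile
from dataclasses import dataclass


@dataclass
class TrainingParameters:
    batch_size: int
    epochs: int
    learning_rate: float


@dataclass
class StatementConfiguration:
    base_repo_name: str
    dataset_repo_name: str
    tokenizer_repo_name: str
    pretrained_seq2seq_repo_name: str
    cache_dir: pathlib.Path
    max_token_length: int
    dataset_percentage: int
    do_eval: bool
    fp16: bool
    statement_training_parameters: TrainingParameters

    @property
    def tokenized_dataset_repo_name(self) -> str:
        return self.dataset_repo_name + "-tokenized"

    def __post_init__(self):
        # JSON stores the cache directory as a string; convert it to a Path.
        self.cache_dir = pathlib.Path(self.cache_dir)


# Hypothetical values for illustration only; none of them come from the project.
EXAMPLE_CONFIG = {
    "base_repo_name": "example-user/statement-demo",
    "dataset_repo_name": "example-user/statement-dataset",
    "tokenizer_repo_name": "example-user/statement-tokenizer",
    "pretrained_seq2seq_repo_name": "example-user/pretrained-seq2seq",
    "cache_dir": "./cache",
    "max_token_length": 512,
    "dataset_percentage": 100,
    "do_eval": True,
    "fp16": True,
    "statement_training_parameters": {"batch_size": 8, "epochs": 3, "learning_rate": 5e-5},
}


def parse_config(json_file_path: pathlib.Path) -> StatementConfiguration:
    """Load a JSON file and build the nested dataclasses from it."""
    with json_file_path.open() as json_file:
        config_dict = json.load(json_file)
    config_dict["statement_training_parameters"] = TrainingParameters(**config_dict["statement_training_parameters"])
    return StatementConfiguration(**config_dict)


def load_example() -> StatementConfiguration:
    """Write the example config to a temporary file and parse it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        json_path = pathlib.Path(tmp_dir) / "statement.json"
        json_path.write_text(json.dumps(EXAMPLE_CONFIG))
        return parse_config(json_path)


if __name__ == "__main__":
    config = load_example()
    print(config.tokenized_dataset_repo_name)
```

The nested `statement_training_parameters` object has to be converted into `TrainingParameters` before the outer dataclass is built, which is why the parser rewrites that key first.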

# manual1

Contains JSONs mapping bytecode instructions and their configurations to use in training.
@@ -0,0 +1,59 @@
from dataclasses import dataclass

import pathlib
import json
import logging


@dataclass
class TrainingParameters:
    batch_size: int
    epochs: int
    learning_rate: float


@dataclass
class StatementConfiguration:
    base_repo_name: str
    dataset_repo_name: str
    tokenizer_repo_name: str
    pretrained_seq2seq_repo_name: str
    cache_dir: pathlib.Path
    max_token_length: int
    dataset_percentage: int
    do_eval: bool
    fp16: bool
    statement_training_parameters: TrainingParameters

    @property
    def tokenized_dataset_repo_name(self):
        return self.dataset_repo_name + "-tokenized"

    @property
    def statement_model_repo_name(self):
        return self.base_repo_name + "-statement"

    @property
    def statement_model_dir(self):
        return self.cache_dir / "models" / self.statement_model_repo_name

    @property
    def log_dir(self):
        return self.statement_model_dir / "logs"

    def __post_init__(self):
        self.cache_dir = pathlib.Path(self.cache_dir)


def parse_statement_config_json(json_file_path: pathlib.Path, logger: logging.Logger = None) -> StatementConfiguration:
    if not json_file_path.exists():
        raise FileNotFoundError(f"{json_file_path} does not exist")

    if logger:
        logger.info(f"Loading model description from {json_file_path}...")

    with json_file_path.open() as json_file:
        statement_config_dict = json.load(json_file)

    statement_config_dict["statement_training_parameters"] = TrainingParameters(**statement_config_dict["statement_training_parameters"])
    return StatementConfiguration(**statement_config_dict)
@@ -0,0 +1,51 @@
import os
from typing import Any

import click
import pathlib

from datasets import load_dataset
from transformers import RobertaTokenizer

from StatementConfiguration import StatementConfiguration, parse_statement_config_json

import functools


def preprocess_function(tokenizer: RobertaTokenizer, max_token_length: int, input_key: str, examples: dict[str, Any]) -> dict[str, Any]:
    """Tokenize both the inputs and the targets with the Hugging Face tokenizer."""
    # Replace missing entries with empty strings so the tokenizer always receives text
    inputs = [ex if ex else "" for ex in examples[input_key]]
    targets = [ex if ex else "" for ex in examples["source"]]

    return tokenizer(text=inputs, text_target=targets, max_length=max_token_length, truncation=True)


def tokenize_seq2seq_dataset(config: StatementConfiguration):
    # ref: https://huggingface.co/Salesforce/codet5-base
    tokenizer = RobertaTokenizer.from_pretrained(config.tokenizer_repo_name)
    raw_datasets = load_dataset(config.dataset_repo_name, token=True)

    column_names = raw_datasets["train"].column_names
    input_key = "bytecode"
    prepped_preprocess_function = functools.partial(preprocess_function, tokenizer, config.max_token_length, input_key)
    tokenized_datasets = raw_datasets.map(
        prepped_preprocess_function,
        batched=True,
        remove_columns=column_names,
        num_proc=os.cpu_count(),
        desc="Tokenizing datasets",
    )

    tokenized_datasets.push_to_hub(config.tokenized_dataset_repo_name, private=True)


@click.command(help="Tokenization script for Statement Translation model given a statement json.")
@click.argument("json_path", type=str)
def main(json_path: str):
    json_file_path = pathlib.Path(json_path)
    statement_config = parse_statement_config_json(json_file_path)
    tokenize_seq2seq_dataset(statement_config)


if __name__ == "__main__":
    main()
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,929 @@
|
|||||||
|
{
|
||||||
|
"add_prefix_space": false,
|
||||||
|
"additional_special_tokens": [
|
||||||
|
"<pad>",
|
||||||
|
"<s>",
|
||||||
|
"</s>",
|
||||||
|
"<unk>",
|
||||||
|
"<mask>",
|
||||||
|
"!",
|
||||||
|
"\"",
|
||||||
|
"#",
|
||||||
|
"$",
|
||||||
|
"%",
|
||||||
|
"&",
|
||||||
|
"'",
|
||||||
|
"(",
|
||||||
|
")",
|
||||||
|
"*",
|
||||||
|
"+",
|
||||||
|
",",
|
||||||
|
"-",
|
||||||
|
".",
|
||||||
|
"/",
|
||||||
|
"0",
|
||||||
|
"1",
|
||||||
|
"2",
|
||||||
|
"3",
|
||||||
|
"4",
|
||||||
|
"5",
|
||||||
|
"6",
|
||||||
|
"7",
|
||||||
|
"8",
|
||||||
|
"9",
|
||||||
|
":",
|
||||||
|
";",
|
||||||
|
"<",
|
||||||
|
"=",
|
||||||
|
">",
|
||||||
|
"?",
|
||||||
|
"@",
|
||||||
|
"A",
|
||||||
|
"B",
|
||||||
|
"C",
|
||||||
|
"D",
|
||||||
|
"E",
|
||||||
|
"F",
|
||||||
|
"G",
|
||||||
|
"H",
|
||||||
|
"I",
|
||||||
|
"J",
|
||||||
|
"K",
|
||||||
|
"L",
|
||||||
|
"M",
|
||||||
|
"N",
|
||||||
|
"O",
|
||||||
|
"P",
|
||||||
|
"Q",
|
||||||
|
"R",
|
||||||
|
"S",
|
||||||
|
"T",
|
||||||
|
"U",
|
||||||
|
"V",
|
||||||
|
"W",
|
||||||
|
"X",
|
||||||
|
"Y",
|
||||||
|
"Z",
|
||||||
|
"[",
|
||||||
|
"\\",
|
||||||
|
"]",
|
||||||
|
"^",
|
||||||
|
"_",
|
||||||
|
"`",
|
||||||
|
"a",
|
||||||
|
"b",
|
||||||
|
"c",
|
||||||
|
"d",
|
||||||
|
"e",
|
||||||
|
"f",
|
||||||
|
"g",
|
||||||
|
"h",
|
||||||
|
"i",
|
||||||
|
"j",
|
||||||
|
"k",
|
||||||
|
"l",
|
||||||
|
"m",
|
||||||
|
"n",
|
||||||
|
"o",
|
||||||
|
"p",
|
||||||
|
"q",
|
||||||
|
"r",
|
||||||
|
"s",
|
||||||
|
"t",
|
||||||
|
"u",
|
||||||
|
"v",
|
||||||
|
"w",
|
||||||
|
"x",
|
||||||
|
"y",
|
||||||
|
"z",
|
||||||
|
"{",
|
||||||
|
"|",
|
||||||
|
"}",
|
||||||
|
"~",
|
||||||
|
"Ġ",
|
||||||
|
"-=",
|
||||||
|
"<<",
|
||||||
|
">>",
|
||||||
|
":=",
|
||||||
|
">=",
|
||||||
|
"<=",
|
||||||
|
"==",
|
||||||
|
"!=",
|
||||||
|
"+=",
|
||||||
|
"//=",
|
||||||
|
"**=",
|
||||||
|
"/=",
|
||||||
|
"//",
|
||||||
|
"%=",
|
||||||
|
"@=",
|
||||||
|
"&=",
|
||||||
|
"|=",
|
||||||
|
"^=",
|
||||||
|
">>=",
|
||||||
|
"<<=",
|
||||||
|
"*=",
|
||||||
|
"()",
|
||||||
|
"):",
|
||||||
|
"~>>",
|
||||||
|
"**",
|
||||||
|
"<codeobj:",
|
||||||
|
"<KWARG_PAD>",
|
||||||
|
"E->",
|
||||||
|
"<TAP_0>",
|
||||||
|
"defaults",
|
||||||
|
"args:",
|
||||||
|
"vararg:",
|
||||||
|
"<TAP_1>",
|
||||||
|
"<START_LINE>",
|
||||||
|
"<SUB>",
|
||||||
|
"<TAP_UP>",
|
||||||
|
"E-END",
|
||||||
|
"<TAP_2>",
|
||||||
|
"~~>",
|
||||||
|
"<TAP_ST>",
|
||||||
|
"<SEP>",
|
||||||
|
"</SUB>",
|
||||||
|
"<TAP_3>",
|
||||||
|
"<TAP_4>",
|
||||||
|
"<TAP_5>",
|
||||||
|
"<TAP_6>",
|
||||||
|
"<TAP_7>",
|
||||||
|
"False",
|
||||||
|
"None",
|
||||||
|
"True",
|
||||||
|
"and",
|
||||||
|
"assert",
|
||||||
|
"async",
|
||||||
|
"await",
|
||||||
|
"break",
|
||||||
|
"class",
|
||||||
|
"continue",
|
||||||
|
"def",
|
||||||
|
"del",
|
||||||
|
"elif",
|
||||||
|
"else",
|
||||||
|
"else:",
|
||||||
|
"except",
|
||||||
|
"except:",
|
||||||
|
"finally",
|
||||||
|
"finally:",
|
||||||
|
"for",
|
||||||
|
"from",
|
||||||
|
"global",
|
||||||
|
"if",
|
||||||
|
"import",
|
||||||
|
"in",
|
||||||
|
"is",
|
||||||
|
"lambda",
|
||||||
|
"nonlocal",
|
||||||
|
"not",
|
||||||
|
"or",
|
||||||
|
"pass",
|
||||||
|
"raise",
|
||||||
|
"return",
|
||||||
|
"try",
|
||||||
|
"try:",
|
||||||
|
"while",
|
||||||
|
"with",
|
||||||
|
"yield",
|
||||||
|
"case",
|
||||||
|
"as",
|
||||||
|
"ASYNC_GEN_WRAP",
|
||||||
|
"BEFORE_ASYNC_WITH",
|
||||||
|
"BEFORE_WITH",
|
||||||
|
"BEGIN_FINALLY",
|
||||||
|
"BINARY_ADD",
|
||||||
|
"BINARY_AND",
|
||||||
|
"BINARY_FLOOR_DIVIDE",
|
||||||
|
"BINARY_LSHIFT",
|
||||||
|
"BINARY_MATRIX_MULTIPLY",
|
||||||
|
"BINARY_MODULO",
|
||||||
|
"BINARY_MULTIPLY",
|
||||||
|
"BINARY_OP",
|
||||||
|
"BINARY_OR",
|
||||||
|
"BINARY_POWER",
|
||||||
|
"BINARY_RSHIFT",
|
||||||
|
"BINARY_SLICE",
|
||||||
|
"BINARY_SUBSCR",
|
||||||
|
"BINARY_SUBTRACT",
|
||||||
|
"BINARY_TRUE_DIVIDE",
|
||||||
|
"BINARY_XOR",
|
||||||
|
"BREAK_LOOP",
|
||||||
|
"BUILD_CONST_KEY_MAP",
|
||||||
|
"BUILD_LIST",
|
||||||
|
"BUILD_LIST_UNPACK",
|
||||||
|
"BUILD_MAP",
|
||||||
|
"BUILD_MAP_UNPACK",
|
||||||
|
"BUILD_MAP_UNPACK_WITH_CALL",
|
||||||
|
"BUILD_SET",
|
||||||
|
"BUILD_SET_UNPACK",
|
||||||
|
"BUILD_SLICE",
|
||||||
|
"BUILD_STRING",
|
||||||
|
"BUILD_TUPLE",
|
||||||
|
"BUILD_TUPLE_UNPACK",
|
||||||
|
"BUILD_TUPLE_UNPACK_WITH_CALL",
|
||||||
|
"CACHE",
|
||||||
|
"CALL_FINALLY",
|
||||||
|
"CALL_FUNCTION",
|
||||||
|
"CALL_FUNCTION_EX",
|
||||||
|
"CALL_FUNCTION_KW",
|
||||||
|
"CALL_METHOD",
|
||||||
|
"CHECK_EG_MATCH",
|
||||||
|
"CHECK_EXC_MATCH",
|
||||||
|
"CLEANUP_THROW",
|
||||||
|
"COMPARE_OP",
|
||||||
|
"CONTAINS_OP",
|
||||||
|
"CONTINUE_LOOP",
|
||||||
|
"COPY",
|
||||||
|
"COPY_DICT_WITHOUT_KEYS",
|
||||||
|
"COPY_FREE_VARS",
|
||||||
|
"DELETE_ATTR",
|
||||||
|
"DELETE_DEREF",
|
||||||
|
"DELETE_FAST",
|
||||||
|
"DELETE_GLOBAL",
|
||||||
|
"DELETE_NAME",
|
||||||
|
"DELETE_SUBSCR",
|
||||||
|
"DICT_MERGE",
|
||||||
|
"DICT_UPDATE",
|
||||||
|
"DUP_TOP",
|
||||||
|
"DUP_TOP_TWO",
|
||||||
|
"END_ASYNC_FOR",
|
||||||
|
"END_FINALLY",
|
||||||
|
"END_FOR",
|
||||||
|
"END_SEND",
|
||||||
|
"EXTENDED_ARG",
|
||||||
|
"FOR_ITER",
|
||||||
|
"FORMAT_VALUE",
|
||||||
|
"GEN_START",
|
||||||
|
"GET_AITER",
|
||||||
|
"GET_ANEXT",
|
||||||
|
"GET_AWAITABLE",
|
||||||
|
"GET_ITER",
|
||||||
|
"GET_LEN",
|
||||||
|
"GET_YIELD_FROM_ITER",
|
||||||
|
"IMPORT_FROM",
|
||||||
|
"IMPORT_NAME",
|
||||||
|
"IMPORT_STAR",
|
||||||
|
"INPLACE_ADD",
|
||||||
|
"INPLACE_AND",
|
||||||
|
"INPLACE_FLOOR_DIVIDE",
|
||||||
|
"INPLACE_LSHIFT",
|
||||||
|
"INPLACE_MATRIX_MULTIPLY",
|
||||||
|
"INPLACE_MODULO",
|
||||||
|
"INPLACE_MULTIPLY",
|
||||||
|
"INPLACE_OR",
|
||||||
|
"INPLACE_POWER",
|
||||||
|
"INPLACE_RSHIFT",
|
||||||
|
"INPLACE_SUBTRACT",
|
||||||
|
"INPLACE_TRUE_DIVIDE",
|
||||||
|
"INPLACE_XOR",
|
||||||
|
"INTERPRETER_EXIT",
|
||||||
|
"IS_OP",
|
||||||
|
"JUMP_ABSOLUTE",
|
||||||
|
"JUMP_BACKWARD",
|
||||||
|
"JUMP_BACKWARD_NO_INTERRUPT",
|
||||||
|
"JUMP_FORWARD",
|
||||||
|
"JUMP_IF_FALSE_OR_POP",
|
||||||
|
"JUMP_IF_NOT_EXC_MATCH",
|
||||||
|
"JUMP_IF_TRUE_OR_POP",
|
||||||
|
"LIST_APPEND",
|
||||||
|
"LIST_EXTEND",
|
||||||
|
"LIST_TO_TUPLE",
|
||||||
|
"LOAD_ASSERTION_ERROR",
|
||||||
|
"LOAD_ATTR",
|
||||||
|
"LOAD_BUILD_CLASS",
|
||||||
|
"LOAD_CLASSDEREF",
|
||||||
|
"LOAD_CLOSURE",
|
||||||
|
"LOAD_CONST",
|
||||||
|
"LOAD_DEREF",
|
||||||
|
"LOAD_FAST",
|
||||||
|
"LOAD_FAST_AND_CLEAR",
|
||||||
|
"LOAD_FAST_CHECK",
|
||||||
|
"LOAD_GLOBAL",
|
||||||
|
"LOAD_LOCALS",
|
||||||
|
"LOAD_METHOD",
|
||||||
|
"LOAD_NAME",
|
||||||
|
"LOAD_SUPER_ATTR",
|
||||||
|
"MAKE_CELL",
|
||||||
|
"MAKE_FUNCTION",
|
||||||
|
"MAP_ADD",
|
||||||
|
"MATCH_CLASS",
|
||||||
|
"MATCH_KEYS",
|
||||||
|
"MATCH_MAPPING",
|
||||||
|
"MATCH_SEQUENCE",
|
||||||
|
"NOP",
|
||||||
|
"POP_BLOCK",
|
||||||
|
"POP_EXCEPT",
|
||||||
|
"POP_FINALLY",
|
||||||
|
"POP_JUMP_FORWARD_IF_FALSE",
|
||||||
|
"POP_JUMP_FORWARD_IF_NONE",
|
||||||
|
"POP_JUMP_FORWARD_IF_NOT_NONE",
|
||||||
|
"POP_JUMP_FORWARD_IF_TRUE",
|
||||||
|
"POP_JUMP_IF_FALSE",
|
||||||
|
"POP_JUMP_IF_NONE",
|
||||||
|
"POP_JUMP_IF_NOT_NONE",
|
||||||
|
"POP_JUMP_IF_TRUE",
|
||||||
|
"POP_TOP",
|
||||||
|
"PRECALL",
|
||||||
|
"PREP_RERAISE_STAR",
|
||||||
|
"PRINT_EXPR",
|
||||||
|
"PUSH_EXC_INFO",
|
||||||
|
"PUSH_NULL",
|
||||||
|
"RAISE_VARARGS",
|
||||||
|
"RERAISE",
|
||||||
|
"RESERVED",
|
||||||
|
"RESUME",
|
||||||
|
"RETURN_CONST",
|
||||||
|
"RETURN_GENERATOR",
|
||||||
|
"RETURN_VALUE",
|
||||||
|
"ROT_FOUR",
|
||||||
|
"ROT_N",
|
||||||
|
"ROT_THREE",
|
||||||
|
"ROT_TWO",
|
||||||
|
"SEND",
|
||||||
|
"SET_ADD",
|
||||||
|
"SET_UPDATE",
|
||||||
|
"SETUP_ANNOTATIONS",
|
||||||
|
"SETUP_ASYNC_WITH",
|
||||||
|
"SETUP_EXCEPT",
|
||||||
|
"SETUP_FINALLY",
|
||||||
|
"SETUP_LOOP",
|
||||||
|
"SETUP_WITH",
|
||||||
|
"STORE_ATTR",
|
||||||
|
"STORE_DEREF",
|
||||||
|
"STORE_FAST",
|
||||||
|
"STORE_GLOBAL",
|
||||||
|
"STORE_NAME",
|
||||||
|
"STORE_SLICE",
|
||||||
|
"STORE_SUBSCR",
|
||||||
|
"SWAP",
|
||||||
|
"UNARY_INVERT",
|
||||||
|
"UNARY_NEGATIVE",
|
||||||
|
"UNARY_NOT",
|
||||||
|
"UNARY_POSITIVE",
|
||||||
|
"UNPACK_EX",
|
||||||
|
"UNPACK_SEQUENCE",
|
||||||
|
"WITH_CLEANUP_FINISH",
|
||||||
|
"WITH_CLEANUP_START",
|
||||||
|
"WITH_EXCEPT_START",
|
||||||
|
"YIELD_FROM",
|
||||||
|
"YIELD_VALUE",
|
||||||
|
"<mask_0>",
|
||||||
|
"<mask_1>",
|
||||||
|
"<mask_2>",
|
||||||
|
"<mask_3>",
|
||||||
|
"<mask_4>",
|
||||||
|
"<mask_5>",
|
||||||
|
"<mask_6>",
|
||||||
|
"<mask_7>",
|
||||||
|
"<mask_8>",
|
||||||
|
"<mask_9>",
|
||||||
|
"<mask_10>",
|
||||||
|
"<mask_11>",
|
||||||
|
"<mask_12>",
|
||||||
|
"<mask_13>",
|
||||||
|
"<mask_14>",
|
||||||
|
"<mask_15>",
|
||||||
|
"<mask_16>",
|
||||||
|
"<mask_17>",
|
||||||
|
"<mask_18>",
|
||||||
|
"<mask_19>",
|
||||||
|
"<mask_20>",
|
||||||
|
"<mask_21>",
|
||||||
|
"<mask_22>",
|
||||||
|
"<mask_23>",
|
||||||
|
"<mask_24>",
|
||||||
|
"<mask_25>",
|
||||||
|
"<mask_26>",
|
||||||
|
"<mask_27>",
|
||||||
|
"<mask_28>",
|
||||||
|
"<mask_29>",
|
||||||
|
"<mask_30>",
|
||||||
|
"<mask_31>",
|
||||||
|
"<mask_32>",
|
||||||
|
"<mask_33>",
|
||||||
|
"<mask_34>",
|
||||||
|
"<mask_35>",
|
||||||
|
"<mask_36>",
|
||||||
|
"<mask_37>",
|
||||||
|
"<mask_38>",
|
||||||
|
"<mask_39>",
|
||||||
|
"<mask_40>",
|
||||||
|
"<mask_41>",
|
||||||
|
"<mask_42>",
|
||||||
|
"<mask_43>",
|
||||||
|
"<mask_44>",
|
||||||
|
"<mask_45>",
|
||||||
|
"<mask_46>",
|
||||||
|
"<mask_47>",
|
||||||
|
"<mask_48>",
|
||||||
|
"<mask_49>",
|
||||||
|
"<mask_50>",
|
||||||
|
"<mask_51>",
|
||||||
|
"<mask_52>",
|
||||||
|
"<mask_53>",
|
||||||
|
"<mask_54>",
|
||||||
|
"<mask_55>",
|
||||||
|
"<mask_56>",
|
||||||
|
"<mask_57>",
|
||||||
|
"<mask_58>",
|
||||||
|
"<mask_59>",
|
||||||
|
"<mask_60>",
|
||||||
|
"<mask_61>",
|
||||||
|
"<mask_62>",
|
||||||
|
"<mask_63>",
|
||||||
|
"<mask_64>",
|
||||||
|
"<mask_65>",
|
||||||
|
"<mask_66>",
|
||||||
|
"<mask_67>",
|
||||||
|
"<mask_68>",
|
||||||
|
"<mask_69>",
|
||||||
|
"<mask_70>",
|
||||||
|
"<mask_71>",
|
||||||
|
"<mask_72>",
|
||||||
|
"<mask_73>",
|
||||||
|
"<mask_74>",
|
||||||
|
"<mask_75>",
|
||||||
|
"<mask_76>",
|
||||||
|
"<mask_77>",
|
||||||
|
"<mask_78>",
|
||||||
|
"<mask_79>",
|
||||||
|
"<mask_80>",
|
||||||
|
"<mask_81>",
|
||||||
|
"<mask_82>",
|
||||||
|
"<mask_83>",
|
||||||
|
"<mask_84>",
|
||||||
|
"<mask_85>",
|
||||||
|
"<mask_86>",
|
||||||
|
"<mask_87>",
|
||||||
|
"<mask_88>",
|
||||||
|
"<mask_89>",
|
||||||
|
"<mask_90>",
|
||||||
|
"<mask_91>",
|
||||||
|
"<mask_92>",
|
||||||
|
"<mask_93>",
|
||||||
|
"<mask_94>",
|
||||||
|
"<mask_95>",
|
||||||
|
"<mask_96>",
|
||||||
|
"<mask_97>",
|
||||||
|
"<mask_98>",
|
||||||
|
"<mask_99>",
|
||||||
|
"<mask_100>",
|
||||||
|
"<mask_101>",
|
||||||
|
"<mask_102>",
|
||||||
|
"<mask_103>",
|
||||||
|
"<mask_104>",
|
||||||
|
"<mask_105>",
|
||||||
|
"<mask_106>",
|
||||||
|
"<mask_107>",
|
||||||
|
"<mask_108>",
|
||||||
|
"<mask_109>",
|
||||||
|
"<mask_110>",
|
||||||
|
"<mask_111>",
|
||||||
|
"<mask_112>",
|
||||||
|
"<mask_113>",
|
||||||
|
"<mask_114>",
|
||||||
|
"<mask_115>",
|
||||||
|
"<mask_116>",
|
||||||
|
"<mask_117>",
|
||||||
|
"<mask_118>",
|
||||||
|
"<mask_119>",
|
||||||
|
"<mask_120>",
|
||||||
|
"<mask_121>",
|
||||||
|
"<mask_122>",
|
||||||
|
"<mask_123>",
|
||||||
|
"<mask_124>",
|
||||||
|
"<mask_125>",
|
||||||
|
"<mask_126>",
|
||||||
|
"<mask_127>",
|
||||||
|
"<mask_128>",
|
||||||
|
"<mask_129>",
|
||||||
|
"<mask_130>",
|
||||||
|
"<mask_131>",
|
||||||
|
"<mask_132>",
|
||||||
|
"<mask_133>",
|
||||||
|
"<mask_134>",
|
||||||
|
"<mask_135>",
|
||||||
|
"<mask_136>",
|
||||||
|
"<mask_137>",
|
||||||
|
"<mask_138>",
|
||||||
|
"<mask_139>",
|
||||||
|
"<mask_140>",
|
||||||
|
"<mask_141>",
|
||||||
|
"<mask_142>",
|
||||||
|
"<mask_143>",
|
||||||
|
"<mask_144>",
|
||||||
|
"<mask_145>",
|
||||||
|
"<mask_146>",
|
||||||
|
"<mask_147>",
|
||||||
|
"<mask_148>",
|
||||||
|
"<mask_149>",
|
||||||
|
"<mask_150>",
|
||||||
|
"<mask_151>",
|
||||||
|
"<mask_152>",
|
||||||
|
"<mask_153>",
|
||||||
|
"<mask_154>",
|
||||||
|
"<mask_155>",
|
||||||
|
"<mask_156>",
|
||||||
|
"<mask_157>",
|
||||||
|
"<mask_158>",
|
||||||
|
"<mask_159>",
|
||||||
|
"<mask_160>",
|
||||||
|
"<mask_161>",
|
||||||
|
"<mask_162>",
|
||||||
|
"<mask_163>",
|
||||||
|
"<mask_164>",
|
||||||
|
"<mask_165>",
|
||||||
|
"<mask_166>",
|
||||||
|
"<mask_167>",
|
||||||
|
"<mask_168>",
|
||||||
|
"<mask_169>",
|
||||||
|
"<mask_170>",
|
||||||
|
"<mask_171>",
|
||||||
|
"<mask_172>",
|
||||||
|
"<mask_173>",
|
||||||
|
"<mask_174>",
|
||||||
|
"<mask_175>",
|
||||||
|
"<mask_176>",
|
||||||
|
"<mask_177>",
|
||||||
|
"<mask_178>",
|
||||||
|
"<mask_179>",
|
||||||
|
"<mask_180>",
|
||||||
|
"<mask_181>",
|
||||||
|
"<mask_182>",
|
||||||
|
"<mask_183>",
|
||||||
|
"<mask_184>",
|
||||||
|
"<mask_185>",
|
||||||
|
"<mask_186>",
|
||||||
|
"<mask_187>",
|
||||||
|
"<mask_188>",
|
||||||
|
"<mask_189>",
|
||||||
|
"<mask_190>",
|
||||||
|
"<mask_191>",
|
||||||
|
"<mask_192>",
|
||||||
|
"<mask_193>",
|
||||||
|
"<mask_194>",
|
||||||
|
"<mask_195>",
|
||||||
|
"<mask_196>",
|
||||||
|
"<mask_197>",
|
||||||
|
"<mask_198>",
|
||||||
|
"<mask_199>",
|
||||||
|
"<mask_200>",
|
||||||
|
"<mask_201>",
|
||||||
|
"<mask_202>",
|
||||||
|
"<mask_203>",
|
||||||
|
"<mask_204>",
|
||||||
|
"<mask_205>",
|
||||||
|
"<mask_206>",
|
||||||
|
"<mask_207>",
|
||||||
|
"<mask_208>",
|
||||||
|
"<mask_209>",
|
||||||
|
"<mask_210>",
|
||||||
|
"<mask_211>",
|
||||||
|
"<mask_212>",
|
||||||
|
"<mask_213>",
|
||||||
|
"<mask_214>",
|
||||||
|
"<mask_215>",
|
||||||
|
"<mask_216>",
|
||||||
|
"<mask_217>",
|
||||||
|
"<mask_218>",
|
||||||
|
"<mask_219>",
|
||||||
|
"<mask_220>",
|
||||||
|
"<mask_221>",
|
||||||
|
"<mask_222>",
|
||||||
|
"<mask_223>",
|
||||||
|
"<mask_224>",
|
||||||
|
"<mask_225>",
|
||||||
|
"<mask_226>",
|
||||||
|
"<mask_227>",
|
||||||
|
"<mask_228>",
|
||||||
|
"<mask_229>",
|
||||||
|
"<mask_230>",
|
||||||
|
"<mask_231>",
|
||||||
|
"<mask_232>",
|
||||||
|
"<mask_233>",
|
||||||
|
"<mask_234>",
|
||||||
|
"<mask_235>",
|
||||||
|
"<mask_236>",
|
||||||
|
"<mask_237>",
|
||||||
|
"<mask_238>",
|
||||||
|
"<mask_239>",
|
||||||
|
"<mask_240>",
|
||||||
|
"<mask_241>",
|
||||||
|
"<mask_242>",
|
||||||
|
"<mask_243>",
|
||||||
|
"<mask_244>",
|
||||||
|
"<mask_245>",
|
||||||
|
"<mask_246>",
|
||||||
|
"<mask_247>",
|
||||||
|
"<mask_248>",
|
||||||
|
"<mask_249>",
|
||||||
|
"<mask_250>",
|
||||||
|
"<mask_251>",
|
||||||
|
"<mask_252>",
|
||||||
|
"<mask_253>",
|
||||||
|
"<mask_254>",
|
||||||
|
"<mask_255>",
|
||||||
|
"<mask_256>",
|
||||||
|
"<mask_257>",
|
||||||
|
"<mask_258>",
|
||||||
|
"<mask_259>",
|
||||||
|
"<mask_260>",
|
||||||
|
"<mask_261>",
|
||||||
|
"<mask_262>",
|
||||||
|
"<mask_263>",
|
||||||
|
"<mask_264>",
|
||||||
|
"<mask_265>",
|
||||||
|
"<mask_266>",
|
||||||
|
"<mask_267>",
|
||||||
|
"<mask_268>",
|
||||||
|
"<mask_269>",
|
||||||
|
"<mask_270>",
|
||||||
|
"<mask_271>",
|
||||||
|
"<mask_272>",
|
||||||
|
"<mask_273>",
|
||||||
|
"<mask_274>",
|
||||||
|
"<mask_275>",
|
||||||
|
"<mask_276>",
|
||||||
|
"<mask_277>",
|
||||||
|
"<mask_278>",
|
||||||
|
"<mask_279>",
|
||||||
|
"<mask_280>",
|
||||||
|
"<mask_281>",
|
||||||
|
"<mask_282>",
|
||||||
|
"<mask_283>",
|
||||||
|
"<mask_284>",
|
||||||
|
"<mask_285>",
|
||||||
|
"<mask_286>",
|
||||||
|
"<mask_287>",
|
||||||
|
"<mask_288>",
|
||||||
|
"<mask_289>",
|
||||||
|
"<mask_290>",
|
||||||
|
"<mask_291>",
|
||||||
|
"<mask_292>",
|
||||||
|
"<mask_293>",
|
||||||
|
"<mask_294>",
|
||||||
|
"<mask_295>",
|
||||||
|
"<mask_296>",
|
||||||
|
"<mask_297>",
|
||||||
|
"<mask_298>",
|
||||||
|
"<mask_299>",
|
||||||
|
"<mask_300>",
|
||||||
|
"<mask_301>",
|
||||||
|
"<mask_302>",
|
||||||
|
"<mask_303>",
|
||||||
|
"<mask_304>",
|
||||||
|
"<mask_305>",
|
||||||
|
"<mask_306>",
|
||||||
|
"<mask_307>",
|
||||||
|
"<mask_308>",
|
||||||
|
"<mask_309>",
|
||||||
|
"<mask_310>",
|
||||||
|
"<mask_311>",
|
||||||
|
"<mask_312>",
|
||||||
|
"<mask_313>",
|
||||||
|
"<mask_314>",
|
||||||
|
"<mask_315>",
|
||||||
|
"<mask_316>",
|
||||||
|
"<mask_317>",
|
||||||
|
"<mask_318>",
|
||||||
|
"<mask_319>",
|
||||||
|
"<mask_320>",
|
||||||
|
"<mask_321>",
|
||||||
|
"<mask_322>",
|
||||||
|
"<mask_323>",
|
||||||
|
"<mask_324>",
|
||||||
|
"<mask_325>",
|
||||||
|
"<mask_326>",
|
||||||
|
"<mask_327>",
|
||||||
|
"<mask_328>",
|
||||||
|
"<mask_329>",
|
||||||
|
"<mask_330>",
|
||||||
|
"<mask_331>",
|
||||||
|
"<mask_332>",
|
||||||
|
"<mask_333>",
|
||||||
|
"<mask_334>",
|
||||||
|
"<mask_335>",
|
||||||
|
"<mask_336>",
|
||||||
|
"<mask_337>",
|
||||||
|
"<mask_338>",
|
||||||
|
"<mask_339>",
|
||||||
|
"<mask_340>",
|
||||||
|
"<mask_341>",
|
||||||
|
"<mask_342>",
|
||||||
|
"<mask_343>",
|
||||||
|
"<mask_344>",
|
||||||
|
"<mask_345>",
|
||||||
|
"<mask_346>",
|
||||||
|
"<mask_347>",
|
||||||
|
"<mask_348>",
|
||||||
|
"<mask_349>",
|
||||||
|
"<mask_350>",
|
||||||
|
"<mask_351>",
|
||||||
|
"<mask_352>",
|
||||||
|
"<mask_353>",
|
||||||
|
"<mask_354>",
|
||||||
|
"<mask_355>",
|
||||||
|
"<mask_356>",
|
||||||
|
"<mask_357>",
|
||||||
|
"<mask_358>",
|
||||||
|
"<mask_359>",
|
||||||
|
"<mask_360>",
|
||||||
|
"<mask_361>",
|
||||||
|
"<mask_362>",
|
||||||
|
"<mask_363>",
|
||||||
|
"<mask_364>",
|
||||||
|
"<mask_365>",
|
||||||
|
"<mask_366>",
|
||||||
|
"<mask_367>",
|
||||||
|
"<mask_368>",
|
||||||
|
"<mask_369>",
|
||||||
|
"<mask_370>",
|
||||||
|
"<mask_371>",
|
||||||
|
"<mask_372>",
|
||||||
|
"<mask_373>",
|
||||||
|
"<mask_374>",
|
||||||
|
"<mask_375>",
|
||||||
|
"<mask_376>",
|
||||||
|
"<mask_377>",
|
||||||
|
"<mask_378>",
|
||||||
|
"<mask_379>",
|
||||||
|
"<mask_380>",
|
||||||
|
"<mask_381>",
|
||||||
|
"<mask_382>",
|
||||||
|
"<mask_383>",
|
||||||
|
"<extra_id_99>",
|
||||||
|
"<extra_id_98>",
|
||||||
|
"<extra_id_97>",
|
||||||
|
"<extra_id_96>",
|
||||||
|
"<extra_id_95>",
|
||||||
|
"<extra_id_94>",
|
||||||
|
"<extra_id_93>",
|
||||||
|
"<extra_id_92>",
|
||||||
|
"<extra_id_91>",
|
||||||
|
"<extra_id_90>",
|
||||||
|
"<extra_id_89>",
|
||||||
|
"<extra_id_88>",
|
||||||
|
"<extra_id_87>",
|
||||||
|
"<extra_id_86>",
|
||||||
|
"<extra_id_85>",
|
||||||
|
"<extra_id_84>",
|
||||||
|
"<extra_id_83>",
|
||||||
|
"<extra_id_82>",
|
||||||
|
"<extra_id_81>",
|
||||||
|
"<extra_id_80>",
|
||||||
|
"<extra_id_79>",
|
||||||
|
"<extra_id_78>",
|
||||||
|
"<extra_id_77>",
|
||||||
|
"<extra_id_76>",
|
||||||
|
"<extra_id_75>",
|
||||||
|
"<extra_id_74>",
|
||||||
|
"<extra_id_73>",
|
||||||
|
"<extra_id_72>",
|
||||||
|
"<extra_id_71>",
|
||||||
|
"<extra_id_70>",
|
||||||
|
"<extra_id_69>",
|
||||||
|
"<extra_id_68>",
|
||||||
|
"<extra_id_67>",
|
||||||
|
"<extra_id_66>",
|
||||||
|
"<extra_id_65>",
|
||||||
|
"<extra_id_64>",
|
||||||
|
"<extra_id_63>",
|
||||||
|
"<extra_id_62>",
|
||||||
|
"<extra_id_61>",
|
||||||
|
"<extra_id_60>",
|
||||||
|
"<extra_id_59>",
|
||||||
|
"<extra_id_58>",
|
||||||
|
"<extra_id_57>",
|
||||||
|
"<extra_id_56>",
|
||||||
|
"<extra_id_55>",
|
||||||
|
"<extra_id_54>",
|
||||||
|
"<extra_id_53>",
|
||||||
|
"<extra_id_52>",
|
||||||
|
"<extra_id_51>",
|
||||||
|
"<extra_id_50>",
|
||||||
|
"<extra_id_49>",
|
||||||
|
"<extra_id_48>",
|
||||||
|
"<extra_id_47>",
|
||||||
|
"<extra_id_46>",
|
||||||
|
"<extra_id_45>",
|
||||||
|
"<extra_id_44>",
|
||||||
|
"<extra_id_43>",
|
||||||
|
"<extra_id_42>",
|
||||||
|
"<extra_id_41>",
|
||||||
|
"<extra_id_40>",
|
||||||
|
"<extra_id_39>",
|
||||||
|
"<extra_id_38>",
|
||||||
|
"<extra_id_37>",
|
||||||
|
"<extra_id_36>",
|
||||||
|
"<extra_id_35>",
|
||||||
|
"<extra_id_34>",
|
||||||
|
"<extra_id_33>",
|
||||||
|
"<extra_id_32>",
|
||||||
|
"<extra_id_31>",
|
||||||
|
"<extra_id_30>",
|
||||||
|
"<extra_id_29>",
|
||||||
|
"<extra_id_28>",
|
||||||
|
"<extra_id_27>",
|
||||||
|
"<extra_id_26>",
|
||||||
|
"<extra_id_25>",
|
||||||
|
"<extra_id_24>",
|
||||||
|
"<extra_id_23>",
|
||||||
|
"<extra_id_22>",
|
||||||
|
"<extra_id_21>",
|
||||||
|
"<extra_id_20>",
|
||||||
|
"<extra_id_19>",
|
||||||
|
"<extra_id_18>",
|
||||||
|
"<extra_id_17>",
|
||||||
|
"<extra_id_16>",
|
||||||
|
"<extra_id_15>",
|
||||||
|
"<extra_id_14>",
|
||||||
|
"<extra_id_13>",
|
||||||
|
"<extra_id_12>",
|
||||||
|
"<extra_id_11>",
|
||||||
|
"<extra_id_10>",
|
||||||
|
"<extra_id_9>",
|
||||||
|
"<extra_id_8>",
|
||||||
|
"<extra_id_7>",
|
||||||
|
"<extra_id_6>",
|
||||||
|
"<extra_id_5>",
|
||||||
|
"<extra_id_4>",
|
||||||
|
"<extra_id_3>",
|
||||||
|
"<extra_id_2>",
|
||||||
|
"<extra_id_1>",
|
||||||
|
"<extra_id_0>",
|
||||||
|
"match",
|
||||||
|
"type",
|
||||||
|
"HAVE_ARGUMENT",
|
||||||
|
"CALL_INTRINSIC_1",
|
||||||
|
"CALL_INTRINSIC_2",
|
||||||
|
"JUMP_NO_INTERRUPT",
|
||||||
|
"nargs",
|
||||||
|
"vargs",
|
||||||
|
"compare",
|
||||||
|
"name",
|
||||||
|
"const",
|
||||||
|
"local"
|
||||||
|
],
|
||||||
|
"bos_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "<s>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"clean_up_tokenization_spaces": true,
|
||||||
|
"cls_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "<s>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"eos_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "</s>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"errors": "replace",
|
||||||
|
"mask_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "<mask>",
|
||||||
|
"lstrip": true,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"model_max_length": 512,
|
||||||
|
"pad_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "<pad>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"sep_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "</s>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"tokenizer_class": "RobertaTokenizer",
|
||||||
|
"trim_offsets": true,
|
||||||
|
"unk_token": {
|
||||||
|
"__type": "AddedToken",
|
||||||
|
"content": "<unk>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": true,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,93 @@
|
|||||||
|
import os
|
||||||
|
import pathlib
|
||||||
|
import time
|
||||||
|
from datetime import timedelta
|
||||||
|
import click
|
||||||
|
|
||||||
|
from datasets import ReadInstruction, load_dataset
|
||||||
|
from StatementConfiguration import StatementConfiguration, parse_statement_config_json
|
||||||
|
from transformers import (
|
||||||
|
DataCollatorForSeq2Seq,
|
||||||
|
RobertaTokenizer,
|
||||||
|
Seq2SeqTrainer,
|
||||||
|
Seq2SeqTrainingArguments,
|
||||||
|
T5ForConditionalGeneration,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def load_tokenized_train_dataset(dataset_repo_name: str, dataset_percentage: int):
|
||||||
|
# Load the tokenized dataset
|
||||||
|
tokenized_train_dataset = load_dataset(
|
||||||
|
dataset_repo_name,
|
||||||
|
token=True,
|
||||||
|
split=ReadInstruction("train", to=dataset_percentage, unit="%"),
|
||||||
|
)
|
||||||
|
return tokenized_train_dataset
|
||||||
|
|
||||||
|
|
||||||
|
def train_statement_model(config: StatementConfiguration):
|
||||||
|
# load the tokenizer and the pretrained seq2seq model named in the config (e.g. Salesforce/codet5-base, a model pretrained for code generation)
|
||||||
|
tokenizer = RobertaTokenizer.from_pretrained(config.tokenizer_repo_name)
|
||||||
|
model = T5ForConditionalGeneration.from_pretrained(config.pretrained_seq2seq_repo_name)
|
||||||
|
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
|
||||||
|
|
||||||
|
model_dir = str(config.statement_model_dir)
|
||||||
|
model_repo_name = config.statement_model_repo_name
|
||||||
|
|
||||||
|
train_args = Seq2SeqTrainingArguments(
|
||||||
|
output_dir=model_dir,
|
||||||
|
learning_rate=config.statement_training_parameters.learning_rate,
|
||||||
|
per_device_train_batch_size=config.statement_training_parameters.batch_size,
|
||||||
|
per_device_eval_batch_size=config.statement_training_parameters.batch_size,
|
||||||
|
weight_decay=0.01,
|
||||||
|
fp16=config.fp16,
|
||||||
|
logging_dir=str(config.log_dir),
|
||||||
|
report_to="tensorboard",
|
||||||
|
logging_strategy="steps",
|
||||||
|
logging_steps=1000,
|
||||||
|
save_strategy="steps",
|
||||||
|
save_steps=10000,
|
||||||
|
save_total_limit=2,
|
||||||
|
num_train_epochs=config.statement_training_parameters.epochs,
|
||||||
|
predict_with_generate=True,
|
||||||
|
push_to_hub=True,
|
||||||
|
hub_model_id=model_repo_name,
|
||||||
|
hub_private_repo=True,
|
||||||
|
ddp_backend="nccl",
|
||||||
|
ddp_find_unused_parameters=False,
|
||||||
|
)
|
||||||
|
|
||||||
|
tokenized_train_dataset = load_tokenized_train_dataset(config.tokenized_dataset_repo_name, config.dataset_percentage)
|
||||||
|
trainer = Seq2SeqTrainer(
|
||||||
|
model=model,
|
||||||
|
args=train_args,
|
||||||
|
data_collator=data_collator,
|
||||||
|
train_dataset=tokenized_train_dataset,
|
||||||
|
tokenizer=tokenizer,
|
||||||
|
)
|
||||||
|
|
||||||
|
start = time.time()
|
||||||
|
trainer.train()
|
||||||
|
duration = str(timedelta(seconds=time.time() - start))
|
||||||
|
|
||||||
|
if int(os.environ.get("LOCAL_RANK", "0")) == 0:
|
||||||
|
# save the latest version of the model locally before uploading it to the Hugging Face Hub
|
||||||
|
trainer.save_model(str(config.statement_model_dir))
|
||||||
|
# this call returns the URL of the commit it just created
|
||||||
|
trainer.push_to_hub(
|
||||||
|
commit_message=duration,
|
||||||
|
finetuned_from=config.pretrained_seq2seq_repo_name,
|
||||||
|
dataset=config.tokenized_dataset_repo_name,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Training script for the statement translation model given a statement json.")
|
||||||
|
@click.argument("json_path", type=str)
|
||||||
|
def main(json_path: str):
|
||||||
|
json_file_path = pathlib.Path(json_path)
|
||||||
|
statement_config = parse_statement_config_json(json_file_path)
|
||||||
|
train_statement_model(statement_config)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,93 @@
|
|||||||
|
import logging
|
||||||
|
import pathlib
|
||||||
|
import click
|
||||||
|
|
||||||
|
from datasets import ReadInstruction, load_dataset
|
||||||
|
from huggingface_hub import HfApi, repo_exists
|
||||||
|
from StatementConfiguration import StatementConfiguration, parse_statement_config_json
|
||||||
|
from tokenizers import Tokenizer
|
||||||
|
from transformers import AutoTokenizer
|
||||||
|
|
||||||
|
|
||||||
|
def get_untrained_tokenizer(tokenizer_repo_name: str) -> AutoTokenizer:
|
||||||
|
tokenizer_dir = pathlib.Path(__file__).parent / tokenizer_repo_name
|
||||||
|
tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)
|
||||||
|
return tokenizer
|
||||||
|
|
||||||
|
|
||||||
|
def save_and_upload_tokenizer(
|
||||||
|
tokenizer: Tokenizer,
|
||||||
|
tokenizer_json_path: pathlib.Path,
|
||||||
|
tokenizer_repo_name: str,
|
||||||
|
dataset_name: str,
|
||||||
|
):
|
||||||
|
# Save the tokenizer locally
|
||||||
|
tokenizer.save_pretrained(str(tokenizer_json_path.parent.resolve()))
|
||||||
|
|
||||||
|
# Upload files to Hugging Face Hub
|
||||||
|
api = HfApi()
|
||||||
|
api.create_repo(tokenizer_repo_name, exist_ok=True, private=True)
|
||||||
|
api.upload_file(
|
||||||
|
path_in_repo="tokenizer_config.json",
|
||||||
|
path_or_fileobj=str(tokenizer_json_path.parent / "tokenizer_config.json"),
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
commit_message=f"Trained tokenizer using {dataset_name}",
|
||||||
|
)
|
||||||
|
api.upload_file(
|
||||||
|
path_in_repo="vocab.json",
|
||||||
|
path_or_fileobj=str(tokenizer_json_path.parent / "vocab.json"),
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
commit_message="Extracted vocabulary from tokenizer",
|
||||||
|
)
|
||||||
|
api.upload_file(
|
||||||
|
path_in_repo="merges.txt",
|
||||||
|
path_or_fileobj=str(tokenizer_json_path.parent / "merges.txt"),
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
commit_message="Extracted merges from tokenizer",
|
||||||
|
)
|
||||||
|
api.upload_file(
|
||||||
|
path_in_repo="tokenizer.json",
|
||||||
|
path_or_fileobj=str(tokenizer_json_path.parent / "tokenizer.json"),
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
commit_message="Extracted tokenizer",
|
||||||
|
)
|
||||||
|
api.upload_file(
|
||||||
|
path_in_repo="special_tokens_map.json",
|
||||||
|
path_or_fileobj=str(tokenizer_json_path.parent / "special_tokens_map.json"),
|
||||||
|
repo_id=tokenizer_repo_name,
|
||||||
|
commit_message="Extracted special tokens map",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def train_tokenizer(config: StatementConfiguration, tokenizer_json_path: pathlib.Path):
|
||||||
|
if repo_exists(config.base_repo_name):
|
||||||
|
logging.error(f"{config.base_repo_name} already exists")
|
||||||
|
exit(1)
|
||||||
|
|
||||||
|
tokenizer = get_untrained_tokenizer("tokenizer")
|
||||||
|
|
||||||
|
train_dataset = load_dataset(
|
||||||
|
config.dataset_repo_name,
|
||||||
|
token=True,
|
||||||
|
split=ReadInstruction("train", to=config.dataset_percentage, unit="%"),
|
||||||
|
)["bytecode"]
|
||||||
|
|
||||||
|
tokenizer = tokenizer.train_new_from_iterator(train_dataset, vocab_size=30000)
|
||||||
|
save_and_upload_tokenizer(
|
||||||
|
tokenizer,
|
||||||
|
tokenizer_json_path,
|
||||||
|
config.tokenizer_repo_name,
|
||||||
|
config.dataset_repo_name,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Training script for the bytecode tokenizer for the statement model given a statement json.")
|
||||||
|
@click.argument("json_path", type=str)
|
||||||
|
def main(json_path: str):
|
||||||
|
json_file_path = pathlib.Path(json_path)
|
||||||
|
statement_config = parse_statement_config_json(json_file_path)
|
||||||
|
train_tokenizer(statement_config, json_file_path)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,117 @@
|
|||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import pathlib
|
||||||
|
import subprocess
|
||||||
|
import click
|
||||||
|
|
||||||
|
from pylingual.utils.get_logger import get_logger
|
||||||
|
|
||||||
|
|
||||||
|
def train_segmentation(segmentation_config_path: pathlib.Path, logger: logging.Logger, nnodes: int = 1, nproc_per_node: int = 1, rdzv_port: int = 29400):
|
||||||
|
segmentation_root = pathlib.Path(__file__).parent / "segmentation"
|
||||||
|
|
||||||
|
# train tokenizer
|
||||||
|
logger.info("training tokenizer...")
|
||||||
|
subprocess.run(["python", segmentation_root / "train_tokenizer.py", segmentation_config_path])
|
||||||
|
|
||||||
|
# train mlm (GPU count comes from nnodes/nproc_per_node; use a single gpu to avoid conflicts with local tokenized data)
|
||||||
|
logger.info("training masked language model...")
|
||||||
|
subprocess.run(
|
||||||
|
[
|
||||||
|
"torchrun",
|
||||||
|
f"--nnodes={nnodes}",
|
||||||
|
f"--nproc-per-node={nproc_per_node}",
|
||||||
|
"--rdzv-backend=c10d",
|
||||||
|
f"--rdzv-endpoint=localhost:{rdzv_port}",
|
||||||
|
segmentation_root / "train_mlm.py",
|
||||||
|
segmentation_config_path,
|
||||||
|
],
|
||||||
|
env=dict(os.environ, NCCL_P2P_DISABLE="1"),
|
||||||
|
)
|
||||||
|
|
||||||
|
# tokenize dataset
|
||||||
|
logger.info("tokenizing segmentation dataset...")
|
||||||
|
subprocess.run(["python", segmentation_root / "tokenize_seg.py", segmentation_config_path])
|
||||||
|
|
||||||
|
# train segmentation model (GPU count comes from nnodes/nproc_per_node)
|
||||||
|
logger.info("training segmentation model...")
|
||||||
|
subprocess.run(
|
||||||
|
[
|
||||||
|
"torchrun",
|
||||||
|
f"--nnodes={nnodes}",
|
||||||
|
f"--nproc-per-node={nproc_per_node}",
|
||||||
|
"--rdzv-backend=c10d",
|
||||||
|
f"--rdzv-endpoint=localhost:{rdzv_port}",
|
||||||
|
segmentation_root / "train_seg.py",
|
||||||
|
segmentation_config_path,
|
||||||
|
],
|
||||||
|
env=dict(os.environ, NCCL_P2P_DISABLE="1"),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def train_statement(statement_config_path: pathlib.Path, logger: logging.Logger, nnodes: int = 1, nproc_per_node: int = 1, rdzv_port: int = 29400):
|
||||||
|
statement_root = pathlib.Path(__file__).parent / "statement"
|
||||||
|
|
||||||
|
# train the statement tokenizer (train_tokenizer_auto.py)
|
||||||
|
subprocess.run(["python", statement_root / "train_tokenizer_auto.py", statement_config_path])
|
||||||
|
|
||||||
|
# tokenize statement dataset with salesforce tokenizer
|
||||||
|
logger.info("tokenizing statement dataset...")
|
||||||
|
subprocess.run(["python", statement_root / "tokenize_seq2seq.py", statement_config_path])
|
||||||
|
|
||||||
|
# train statement model (GPU count comes from nnodes/nproc_per_node)
|
||||||
|
logger.info("training statement model...")
|
||||||
|
subprocess.run(
|
||||||
|
[
|
||||||
|
"torchrun",
|
||||||
|
f"--nnodes={nnodes}",
|
||||||
|
f"--nproc-per-node={nproc_per_node}",
|
||||||
|
"--rdzv-backend=c10d",
|
||||||
|
f"--rdzv-endpoint=localhost:{rdzv_port}",
|
||||||
|
statement_root / "train_seq2seq.py",
|
||||||
|
statement_config_path,
|
||||||
|
],
|
||||||
|
env=dict(os.environ, NCCL_P2P_DISABLE="1"),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@click.command(help="Full tokenization and training pipeline for the segmentation and statement translation models.")
|
||||||
|
@click.option("--segmentation", type=str, default=None, help="The path to the segmentation model description JSON file.")
|
||||||
|
@click.option("--statement", type=str, default=None, help="The path to the statement model description JSON file.")
|
||||||
|
@click.option("--nnodes", type=int, default=1, help="Torchrun nnodes arg")
|
||||||
|
@click.option("--nproc_per_node", type=int, default=1, help="Torchrun nproc_per_node arg")
|
||||||
|
@click.option("--rdzv_port", "-p", type=int, default=29400, help="Port to use for torchrun rendezvous endpoint")
|
||||||
|
def main(segmentation: str, statement: str, nnodes: int, nproc_per_node: int, rdzv_port: int):
|
||||||
|
logger = get_logger("train-models")
|
||||||
|
|
||||||
|
### LOAD JSON
|
||||||
|
logger.info("Training pipeline starting...")
|
||||||
|
logger.info("Loading dataset description JSON files...")
|
||||||
|
|
||||||
|
### CONFIG_PATHS
|
||||||
|
segmentation_config_path = pathlib.Path(segmentation).resolve() if segmentation is not None else None
|
||||||
|
statement_config_path = pathlib.Path(statement).resolve() if statement is not None else None
|
||||||
|
|
||||||
|
logger.info("Dataset description JSON files loaded!")
|
||||||
|
|
||||||
|
### TRAIN SEGMENTATION
|
||||||
|
if segmentation_config_path is not None:
|
||||||
|
logger.info("Segmentation model training starting...")
|
||||||
|
train_segmentation(segmentation_config_path, logger, nnodes, nproc_per_node, rdzv_port)
|
||||||
|
logger.info("Segmentation model training complete!")
|
||||||
|
else:
|
||||||
|
logger.warning("Segmentation model configuration json path not provided in --segmentation; skipping segmentation model training...")
|
||||||
|
|
||||||
|
### TRAIN STATEMENT
|
||||||
|
if statement_config_path is not None:
|
||||||
|
logger.info("Statement model training starting...")
|
||||||
|
train_statement(statement_config_path, logger, nnodes, nproc_per_node, rdzv_port)
|
||||||
|
logger.info("Statement model training complete!")
|
||||||
|
else:
|
||||||
|
logger.warning("Statement model configuration json path not provided in --statement; skipping statement model training...")
|
||||||
|
|
||||||
|
logger.info("Training pipeline complete!")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
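The three `torchrun` invocations above share the same argument layout. A minimal sketch of factoring that layout into one place, assuming a hypothetical helper named `build_torchrun_cmd` (it is not part of the repository):

```python
import pathlib


def build_torchrun_cmd(script, config_path, nnodes=1, nproc_per_node=1, rdzv_port=29400):
    """Hypothetical helper: build the argv list passed to subprocess.run for one training stage."""
    return [
        "torchrun",
        f"--nnodes={nnodes}",
        f"--nproc-per-node={nproc_per_node}",
        "--rdzv-backend=c10d",
        f"--rdzv-endpoint=localhost:{rdzv_port}",
        str(script),  # paths are stringified so the list can be logged or compared easily
        str(config_path),
    ]


cmd = build_torchrun_cmd(pathlib.Path("train_seq2seq.py"), pathlib.Path("statement.json"), nproc_per_node=4)
print(" ".join(cmd))
```

The returned list could be handed to `subprocess.run(cmd, env=...)` exactly as in the script above. Passing `check=True` as well would make a failed stage raise instead of letting the pipeline continue; that is a suggestion, not something the original code does.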
Generated
+3044
File diff suppressed because it is too large
@@ -0,0 +1,3 @@
from .decompiler import decompile

__all__ = ["decompile"]
@@ -0,0 +1,58 @@
import networkx as nx

from typing import Callable, Any

from pylingual.editable_bytecode.control_flow_graph import ControlFlowEdgeType


def get_out_edge_dict(cfg: nx.DiGraph, node) -> dict:
    edge_dict = {"natural": (None, None), "conditional": (None, None), "exception": (None, None)}
    if node is None:
        return edge_dict

    out_edges = cfg.out_edges(nbunch=node, data=True)
    for source, target, edge_props in out_edges:
        if edge_props["type"] in [ControlFlowEdgeType.NATURAL.value, ControlFlowEdgeType.JUMP.value]:
            edge_dict["natural"] = (target, edge_props)
        elif edge_props["type"] in [ControlFlowEdgeType.TRUE_JUMP.value, ControlFlowEdgeType.FALSE_JUMP.value]:
            edge_dict["conditional"] = (target, edge_props)
        elif edge_props["type"] == ControlFlowEdgeType.EXCEPTION.value:
            edge_dict["exception"] = (target, edge_props)
        elif edge_props["type"] == ControlFlowEdgeType.META.value:
            pass  # ignore meta edges in graph traversal
        else:
            raise ValueError(f"Unknown edge type {edge_props['type']}")
    return edge_dict


def _to_iter(item):
    """Converts something to an iterable version"""
    if not hasattr(item, "__iter__") or isinstance(item, str):
        return (item,)
    return item


def create_dominator_tree(graph, start_node=None):
    """Creates a dominator tree for the given graph"""

    # default start node is the minimum offset node
    if start_node is None:
        get_start_offset = lambda node: min(_to_iter(graph.nodes.data()[node].get("offset", ())), default=float("inf"))
        start_node = min(graph.nodes, key=get_start_offset)

    dominator_tree = nx.create_empty_copy(graph)
    dominator_tree.add_edges_from(nx.immediate_dominators(graph, start_node).items())
    dominator_tree.remove_edge(start_node, start_node)
    return dominator_tree.reverse()


def get_dominator_function(cfg: nx.DiGraph) -> Callable[[Any, Any], bool]:
    # preprocessing to identify loop headers; dominator tree cached in cfg so we don't recompute unless the graph changed
    if not hasattr(cfg, "dominator_tree"):
        cfg.dominator_tree = create_dominator_tree(cfg, start_node="START")
        cfg.domination_relation = nx.transitive_closure_dag(cfg.dominator_tree)

    def dominates(a, b):
        return cfg.domination_relation.has_edge(a, b) or a == b

    return dominates
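To illustrate how the dominator helpers above behave, here is a small self-contained sketch on a toy graph. The node names and the diamond-shaped graph are made up for the example; only `networkx` calls that the file above already relies on are used, and the tree is built directly with dominator-to-dominated edges rather than by reversing.

```python
import networkx as nx

# Toy control flow graph (hypothetical): START -> a -> (b | c) -> d
cfg = nx.DiGraph()
cfg.add_edges_from([("START", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])

# Maps each reachable node to its immediate dominator
idom = nx.immediate_dominators(cfg, "START")

# Dominator tree with edges pointing from dominator to dominated node;
# skipping node == dom avoids a self-loop on the start node
tree = nx.DiGraph()
tree.add_nodes_from(cfg.nodes)
tree.add_edges_from((dom, node) for node, dom in idom.items() if node != dom)

# The transitive closure turns "is an ancestor in the tree" into a single edge lookup
closure = nx.transitive_closure_dag(tree)


def dominates(a, b):
    return a == b or closure.has_edge(a, b)


print(dominates("a", "d"), dominates("b", "d"))
```

Node `d` is reachable through either `b` or `c`, so neither branch dominates it, while `a` and `START` do.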
@@ -0,0 +1,57 @@
|
|||||||
|
import os
|
||||||
|
|
||||||
|
from pylingual.editable_bytecode import EditableBytecode
|
||||||
|
from pylingual.editable_bytecode.control_flow_graph import bytecode_to_control_flow_graph
|
||||||
|
from pylingual.utils.use_escape_sequences import use_escape_sequences
|
||||||
|
|
||||||
|
from .structure_control_flow import structure_control_flow
|
||||||
|
|
||||||
|
|
||||||
|
def pyc_to_indented_sources(pyc: EditableBytecode, source_lines: list[str]) -> dict[object, str]:
|
||||||
|
sources = {}
|
||||||
|
for bytecode in pyc.iter_bytecodes():
|
||||||
|
sources[bytecode.codeobj] = bytecode_to_indented_source(bytecode, source_lines)
|
||||||
|
return sources
|
||||||
|
|
||||||
|
|
||||||
|
def split_newlines(li):
|
||||||
|
return "\n".join(li).split("\n")
|
||||||
|
|
||||||
|
|
||||||
|
def bytecode_to_indented_source(bytecode: EditableBytecode, source_lines: list[str]) -> list[str]:
|
||||||
|
cfg = bytecode_to_control_flow_graph(bytecode)
|
||||||
|
|
||||||
|
# breakpoint to debug control flow templates if DEBUG_CFLOW is set
|
||||||
|
if os.environ.get("DEBUG_CFLOW", None) == "1":
|
||||||
|
breakpoint()
|
||||||
|
|
||||||
|
structured = structure_control_flow(cfg, bytecode)
|
||||||
|
indented_source = structured.to_indented_source(source_lines).split("\n")
|
||||||
|
|
||||||
|
bytecode.ordered_instructions = structured.get_instructions()
|
||||||
|
# force generator if necessary
|
||||||
|
if bytecode.codeobj.co_flags & (0x20 | 0x200):
|
||||||
|
if not any(x.strip().startswith("yield ") or x.strip() == "yield" for x in split_newlines(indented_source)):
|
||||||
|
indented_source.insert(0, "if False: yield # inserted")
|
||||||
|
|
||||||
|
# insert globals
|
||||||
|
for global_var in bytecode.globals:
|
||||||
|
indented_source.insert(0, f"global {global_var} # inserted")
|
||||||
|
|
||||||
|
# insert nonlocals
|
||||||
|
parent_nonlocal = set()
|
||||||
|
parent = bytecode.parent
|
||||||
|
while parent:
|
||||||
|
parent_nonlocal |= parent.nonlocals
|
||||||
|
parent = parent.parent
|
||||||
|
for nonlocal_var in bytecode.nonlocals:
|
||||||
|
if nonlocal_var in parent_nonlocal:
|
||||||
|
indented_source.insert(0, f"nonlocal {nonlocal_var} # inserted")
|
||||||
|
|
||||||
|
# add function docstring
|
||||||
|
if bytecode.codeobj.co_flags & 0x2:
|
||||||
|
if bytecode.codeobj.co_consts and isinstance(bytecode.codeobj.co_consts[0], str):
|
||||||
|
doc = use_escape_sequences(bytecode.codeobj.co_consts[0])
|
||||||
|
indented_source.insert(0, f'"""{doc}""" # inserted')
|
||||||
|
|
||||||
|
return [line for line in indented_source if line] # filter out empty strings
|
||||||
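The numeric `co_flags` masks used above line up with named constants in the standard `inspect` module: `0x20` is `CO_GENERATOR`, `0x200` is `CO_ASYNC_GENERATOR`, and `0x2` is `CO_NEWLOCALS`. A minimal sketch showing why an inserted `if False: yield` is enough to force a generator; the function names here are illustrative only:

```python
import inspect

# Same mask as the check above: 0x20 | 0x200
GENERATOR_MASK = inspect.CO_GENERATOR | inspect.CO_ASYNC_GENERATOR


def plain():
    return 1


def forced_generator():
    if False:
        yield  # never executes, but its presence still marks the function as a generator


async def forced_async_generator():
    if False:
        yield


def is_generator_code(code) -> bool:
    """True when the code object belongs to a (sync or async) generator."""
    return bool(code.co_flags & GENERATOR_MASK)


print(is_generator_code(plain.__code__), is_generator_code(forced_generator.__code__))
```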
@@ -0,0 +1,171 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from ..cfg_utils import get_out_edge_dict
|
||||||
|
|
||||||
|
from typing import Any, Callable
|
||||||
|
from .abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
# INVARIANT: each node will have at most one of each "natural", "conditional", and "exception" edge
|
||||||
|
|
||||||
|
|
||||||
|
class TemplateEdge:
|
||||||
|
def __init__(self, source: Any, dest: Any, edge_verification_func: Callable[[Any, Any, dict], bool] = None, commit_none_to_mapping: bool = True) -> None:
|
||||||
|
self.source = source
|
||||||
|
self.dest = dest
|
||||||
|
self.edge_verification_func = edge_verification_func
|
||||||
|
# for optional edges, toggle if the absence of a node will be committed to the mapping
|
||||||
|
# set to False for edges that may not exist, even when their template destination may be reachable from other nodes
|
||||||
|
self.commit_none_to_mapping = commit_none_to_mapping
|
||||||
|
|
||||||
|
def check_edge(self, graph_source: Any, graph_dest: Any, graph_edge_properties: dict) -> bool:
|
||||||
|
# if no verification function is provided, just check that the edge exists
|
||||||
|
if self.edge_verification_func is None:
|
||||||
|
return graph_dest is not None
|
||||||
|
|
||||||
|
return self.edge_verification_func(graph_source, graph_dest, graph_edge_properties)
|
||||||
|
|
||||||
|
|
||||||
|
class TemplateNode:
|
||||||
|
def __init__(
|
||||||
|
self, node_verification_func: Callable[[nx.DiGraph, Any], bool] = None, natural_edge: TemplateEdge = None, conditional_edge: TemplateEdge = None, exception_edge: TemplateEdge = None, subtemplate: ControlFlowTemplate = None
|
||||||
|
) -> None:
|
||||||
|
self.node_verification_func = node_verification_func
|
||||||
|
self.natural_edge = natural_edge
|
||||||
|
self.conditional_edge = conditional_edge
|
||||||
|
self.exception_edge = exception_edge
|
||||||
|
self.subtemplate = subtemplate
|
||||||
|
|
||||||
|
def check_node(self, cfg: nx.DiGraph, node: Any) -> bool:
|
||||||
|
# if the graph node is not a valid candidate for this template node, then this is not a valid mapping
|
||||||
|
# it is the job of the node verification func to check in_degree
|
||||||
|
if self.node_verification_func is None:
|
||||||
|
if node is None:
|
||||||
|
return False
|
||||||
|
elif not self.node_verification_func(cfg, node):
|
||||||
|
return False
|
||||||
|
|
||||||
|
# check the outgoing edges for this node
|
||||||
|
node_out_edge_dict = get_out_edge_dict(cfg, node)
|
||||||
|
natural_target, natural_properties = node_out_edge_dict["natural"] if node_out_edge_dict["natural"] else (None, None)
|
||||||
|
# if the edge is in the template, it must be valid
|
||||||
|
if self.natural_edge and not self.natural_edge.check_edge(node, natural_target, natural_properties):
|
||||||
|
return False
|
||||||
|
# if the edge is not in the template, reject
|
||||||
|
if natural_target and not self.natural_edge:
|
||||||
|
return False
|
||||||
|
|
||||||
|
conditional_target, conditional_properties = node_out_edge_dict["conditional"] if node_out_edge_dict["conditional"] else (None, None)
|
||||||
|
# if the edge is in the template, it must be valid
|
||||||
|
if self.conditional_edge and not self.conditional_edge.check_edge(node, conditional_target, conditional_properties):
|
||||||
|
return False
|
||||||
|
# if the edge is not in the template, reject
|
||||||
|
if conditional_target and not self.conditional_edge:
|
||||||
|
return False
|
||||||
|
|
||||||
|
exception_target, exception_properties = node_out_edge_dict["exception"] if node_out_edge_dict["exception"] else (None, None)
|
||||||
|
# if the edge is in the template, it must be valid
|
||||||
|
if self.exception_edge and not self.exception_edge.check_edge(node, exception_target, exception_properties):
|
||||||
|
return False
|
||||||
|
# if the edge is not in the template, reject
|
||||||
|
if exception_target and not self.exception_edge:
|
||||||
|
return False
|
||||||
|
|
||||||
|
# node is good and all outgoing edges are good
|
||||||
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
class GraphTemplateMatcher:
|
||||||
|
def __init__(self, template_node_dict: dict[Any, TemplateNode], root_key: Any, mapping_verification_func: Callable[[nx.DiGraph, dict], bool]) -> None:
|
||||||
|
self.template_node_dict = template_node_dict
|
||||||
|
self.root_key = root_key
|
||||||
|
self.mapping_verification_func = mapping_verification_func
|
||||||
|
|
||||||
|
def match_at_graph_node(self, cfg: nx.DiGraph, root_node: Any) -> dict:
|
||||||
|
mapping = dict()
|
||||||
|
mapped_nodes = set()
|
||||||
|
|
||||||
|
dfs_stack = [(self.root_key, root_node)]
|
||||||
|
|
||||||
|
original_cfg = cfg # save this reference for later
|
||||||
|
|
||||||
|
while dfs_stack:
|
||||||
|
current_template_key, current_graph_node = dfs_stack.pop()
|
||||||
|
current_template_node = self.template_node_dict[current_template_key]
|
||||||
|
# if the template node has already been mapped, we don't process it again
|
||||||
|
if current_template_key in mapping:
|
||||||
|
# if the current template node has been mapped inconsistently, then the mapping failed
|
||||||
|
if mapping[current_template_key] != current_graph_node:
|
||||||
|
return None
|
||||||
|
else:
|
||||||
|
continue
|
||||||
|
if current_graph_node in mapped_nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# try to match the node subtemplate if one was provided
|
||||||
|
# if there is a match, then update the cfg under consideration, ensuring that nodes don't get double-mapped
|
||||||
|
if current_template_node.subtemplate:
|
||||||
|
updated_cfg = current_template_node.subtemplate.try_to_match_node(cfg, current_graph_node)
|
||||||
|
# if we didn't match the subtemplate, then this node matching failed
|
||||||
|
if not updated_cfg:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# check that previously mapped nodes did not get removed
|
||||||
|
for mapped_node in mapping.values():
|
||||||
|
if mapped_node is not None and mapped_node not in updated_cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# update the current graph node
|
||||||
|
added_nodes = set(updated_cfg.nodes) - set(cfg.nodes)
|
||||||
|
# enforce invariant that templates add no more than one node
|
||||||
|
assert len(added_nodes) <= 1
|
||||||
|
if added_nodes:
|
||||||
|
current_graph_node = added_nodes.pop()
|
||||||
|
|
||||||
|
# update the cfg
|
||||||
|
cfg = updated_cfg
|
||||||
|
|
||||||
|
# if the node is not a valid match, then the mapping failed
|
||||||
|
# check_node also checks all the outgoing edges
|
||||||
|
if not current_template_node.check_node(cfg, current_graph_node):
|
||||||
|
return None
|
||||||
|
|
||||||
|
mapping[current_template_key] = current_graph_node
|
||||||
|
mapped_nodes.add(current_graph_node)
|
||||||
|
|
||||||
|
graph_node_out_edge_dict = get_out_edge_dict(cfg, current_graph_node)
|
||||||
|
|
||||||
|
# extend along the natural edge
|
||||||
|
if current_template_node.natural_edge:
|
||||||
|
next_template_key = current_template_node.natural_edge.dest
|
||||||
|
if next_template_key is not None:
|
||||||
|
next_graph_node, _ = graph_node_out_edge_dict["natural"] if graph_node_out_edge_dict["natural"] else (None, None)
|
||||||
|
if next_graph_node is not None or current_template_node.natural_edge.commit_none_to_mapping:
|
||||||
|
dfs_stack.append((next_template_key, next_graph_node))
|
||||||
|
|
||||||
|
# extend along the conditional edge
|
||||||
|
if current_template_node.conditional_edge:
|
||||||
|
next_template_key = current_template_node.conditional_edge.dest
|
||||||
|
if next_template_key is not None:
|
||||||
|
next_graph_node, _ = graph_node_out_edge_dict["conditional"] if graph_node_out_edge_dict["conditional"] else (None, None)
|
||||||
|
if next_graph_node is not None or current_template_node.conditional_edge.commit_none_to_mapping:
|
||||||
|
dfs_stack.append((next_template_key, next_graph_node))
|
||||||
|
|
||||||
|
# extend along the exception edge
|
||||||
|
if current_template_node.exception_edge:
|
||||||
|
next_template_key = current_template_node.exception_edge.dest
|
||||||
|
if next_template_key is not None:
|
||||||
|
next_graph_node, _ = graph_node_out_edge_dict["exception"] if graph_node_out_edge_dict["exception"] else (None, None)
|
||||||
|
if next_graph_node is not None or current_template_node.exception_edge.commit_none_to_mapping:
|
||||||
|
dfs_stack.append((next_template_key, next_graph_node))
|
||||||
|
|
||||||
|
# we have a final mapping, check any top-level verification stuff
|
||||||
|
if self.mapping_verification_func and not self.mapping_verification_func(cfg, mapping):
|
||||||
|
return None
|
||||||
|
|
||||||
|
# mapping was successful
|
||||||
|
if cfg == original_cfg:
|
||||||
|
return mapping
|
||||||
|
|
||||||
|
# commit changes to the original cfg by modifying the reference
|
||||||
|
original_cfg.clear()
|
||||||
|
original_cfg.update(cfg)
|
||||||
|
return mapping
|
||||||
+5
@@ -0,0 +1,5 @@
# base class for exception block templates


class AbstractExceptionBlockTemplate:
    pass
+5
@@ -0,0 +1,5 @@
# marker base class for templates that deal with end finallys, so those templates can be whitelisted


class AbstractNonSequentiable:
    pass
+42
@@ -0,0 +1,42 @@
|
|||||||
|
from abc import ABC, abstractmethod
|
||||||
|
|
||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from pylingual.editable_bytecode import Inst
|
||||||
|
|
||||||
|
|
||||||
|
class ControlFlowTemplate(ABC):
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _indent_multiline_string(multiline_string: str, indentation_level: int = 1) -> str:
|
||||||
|
return "\n".join("\t" * indentation_level + line.rstrip() for line in multiline_string.split("\n") if line)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
name = type(self).__name__
|
||||||
|
components = ControlFlowTemplate._indent_multiline_string(",\n".join(f"{key}={repr(value)}" for key, value in vars(self).items()))
|
||||||
|
return f"{name}[\n{components}]"
|
||||||
|
|
||||||
|
def get_instructions(self) -> list[Inst]:
|
||||||
|
insts: list[Inst] = []
|
||||||
|
for key, value in vars(self).items():
|
||||||
|
if hasattr(value, "get_instructions"):
|
||||||
|
insts.extend(value.get_instructions())
|
||||||
|
elif isinstance(value, Inst):
|
||||||
|
insts.append(value)
|
||||||
|
return insts
|
||||||
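The indentation helper above is easy to misread, so here is a minimal standalone sketch of the same expression. The function name `indent_multiline_string` and the sample input are illustrative only; the body mirrors `ControlFlowTemplate._indent_multiline_string`.

```python
def indent_multiline_string(multiline_string: str, indentation_level: int = 1) -> str:
    # Same expression as ControlFlowTemplate._indent_multiline_string:
    # empty lines are dropped, trailing whitespace is stripped,
    # and each surviving line is prefixed with one tab per indentation level.
    return "\n".join("\t" * indentation_level + line.rstrip() for line in multiline_string.split("\n") if line)


# Illustrative input: trailing spaces, a blank line, and a trailing newline.
example = "if x:  \n\n    pass\n"
indented = indent_multiline_string(example, indentation_level=2)
print(repr(indented))
```

Note that only truly empty lines are filtered out; a line containing just spaces passes the `if line` test and is emitted as bare tabs after `rstrip()`.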
+321
@@ -0,0 +1,321 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_node_has_no_backwards_edges, node_match_all, assert_no_linestarts
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ChainedComparisonTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A chained comparison such as a == b == c.
|
||||||
|
(0)
|
||||||
|
/ \\j
|
||||||
|
(1) (2) (0123)
|
||||||
|
/ \\j/j / \\j
|
||||||
|
(3) (5) --> (4) (5)
|
||||||
|
|j
|
||||||
|
(4)
|
||||||
|
|
||||||
|
not (a == b == c)
|
||||||
|
|
||||||
|
(0)
|
||||||
|
j/ \\
|
||||||
|
(2) (1) --> (0123)
|
||||||
|
/ / \\j / \\j
|
||||||
|
| (3) (5) (4) (5)
|
||||||
|
| /j
|
||||||
|
(4)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
This condenses the chained comparison down to be matched against an if-like template later
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_node_has_no_backwards_edges,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="j2if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"j2if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="j2if_body",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="j2if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
_subgraph2 = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_node_has_no_backwards_edges,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="j2if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"j2if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="j2if_body",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="j2if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, first_condition: ControlFlowTemplate, second_condition: ControlFlowTemplate, cleanup: ControlFlowTemplate, j2if_body: ControlFlowTemplate):
|
||||||
|
self.first_condition = first_condition
|
||||||
|
self.second_condition = second_condition
|
||||||
|
self.cleanup = cleanup
|
||||||
|
self.j2if_body = j2if_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ChainedComparisonTemplate._subgraph, root_key="first_condition", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ChainedComparisonTemplate._subgraph2, root_key="first_condition", mapping_verification_func=None)
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
chained_comparison_template = ChainedComparisonTemplate(first_condition=mapping["first_condition"], second_condition=mapping["second_condition"], cleanup=mapping["cleanup"], j2if_body=mapping["j2if_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, chained_comparison_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(chained_comparison_template, mapping["if_body"], {"type": ControlFlowEdgeType.NATURAL.value}), (chained_comparison_template, mapping["tail"], {"type": ControlFlowEdgeType.TRUE_JUMP.value})]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([chained_comparison_template.first_condition, chained_comparison_template.second_condition, chained_comparison_template.cleanup, chained_comparison_template.j2if_body])
|
||||||
|
reduced_cfg.add_node(chained_comparison_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
first_condition = self.first_condition.to_indented_source(source_lines)
|
||||||
|
second_condition = self.second_condition.to_indented_source(source_lines)
|
||||||
|
return "\n".join([first_condition, second_condition])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
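The reduction step in `try_to_match_node` (remove the matched nodes, add one template node, re-attach the boundary edges) can be sketched without networkx. This is a simplified illustration, not the project's implementation: the CFG is a plain list of `(src, dst, edge_type)` triples, the edge type strings are placeholders for the `ControlFlowEdgeType` values, and `condense` is a hypothetical helper.

```python
# Hypothetical stand-in for a CFG: a list of (src, dst, edge_type) triples.
Edge = tuple[str, str, str]


def condense(edges: list[Edge], root: str, absorbed: set[str], merged: str, out_edges: list[Edge]) -> list[Edge]:
    """Replace the `absorbed` nodes with a single `merged` node."""
    # Edges that entered the root from outside the pattern now enter the merged node.
    # (Assumption: in-edges from absorbed nodes are skipped; the template itself
    # rules them out by requiring that the root has no backwards edges.)
    in_edges = [(src, merged, kind) for src, dst, kind in edges if dst == root and src not in absorbed]
    # Drop every edge that touches an absorbed node, then re-attach the boundary.
    kept = [e for e in edges if e[0] not in absorbed and e[1] not in absorbed]
    return kept + in_edges + out_edges


# The first diagram in the docstring: 0=first_condition, 1=second_condition,
# 2=cleanup, 3=j2if_body, 4=if_body, 5=tail.
edges = [
    ("entry", "0", "natural"),
    ("0", "1", "natural"), ("0", "2", "jump"),
    ("1", "3", "natural"), ("1", "5", "jump"),
    ("2", "5", "natural"),
    ("3", "4", "natural"),
    ("4", "5", "natural"),
]
reduced = condense(
    edges,
    root="0",
    absorbed={"0", "1", "2", "3"},
    merged="0123",
    out_edges=[("0123", "4", "natural"), ("0123", "5", "jump")],
)
print(reduced)
```

After the reduction the merged node has one natural successor and one jump successor, which is the shape an if-like template expects.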
+274
@@ -0,0 +1,274 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..loop.LoopExitTemplate import LoopExitTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_node_has_no_backwards_edges, node_match_all, assert_no_linestarts, assert_node_type, node_match_none
|
||||||
|
|
||||||
|
from ..loop.PreRefinedLoopTemplate import PreRefinedLoopTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class ShortCircuitAndTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A short-circuit evaluated boolean AND. Typically these are all part of one line.
|
||||||
|
(0)
|
||||||
|
/ \\ (01)
|
||||||
|
(1) |j --> / \\j
|
||||||
|
|\\j| (2) (3)
|
||||||
|
(2) (3)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
This condenses the short-circuit down to be matched against an if-like template later
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_node_has_no_backwards_edges,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
node_match_none(assert_node_type(PreRefinedLoopTemplate)),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
_subgraph_for_loop_exits = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_node_has_no_backwards_edges,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="first_loop_exit_tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_no_linestarts,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="second_loop_exit_tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"first_loop_exit_tail": TemplateNode(
|
||||||
|
node_verification_func=assert_node_type(LoopExitTemplate),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_loop_exit_tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_loop_exit_tail": TemplateNode(
|
||||||
|
node_verification_func=assert_node_type(LoopExitTemplate),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_loop_exit_tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def _verify_loop_exit_match(cfg: nx.DiGraph, mapping: dict) -> bool:
|
||||||
|
first_tail = mapping["first_loop_exit_tail"]
|
||||||
|
second_tail = mapping["second_loop_exit_tail"]
|
||||||
|
if not isinstance(first_tail, LoopExitTemplate) or not isinstance(second_tail, LoopExitTemplate):
|
||||||
|
return False
|
||||||
|
|
||||||
|
# the loop exits should have no code associated with them
|
||||||
|
# this part of the pattern is just to deal with implicit continues that got split into separate nodes
|
||||||
|
if first_tail.tail or second_tail.tail:
|
||||||
|
return False
|
||||||
|
|
||||||
|
return first_tail.exit_statement == second_tail.exit_statement
|
||||||
|
|
||||||
|
def __init__(self, first_condition: ControlFlowTemplate, second_condition: ControlFlowTemplate):
|
||||||
|
self.first_condition = first_condition
|
||||||
|
self.second_condition = second_condition
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ShortCircuitAndTemplate._subgraph, root_key="first_condition", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ShortCircuitAndTemplate._subgraph_for_loop_exits, root_key="first_condition", mapping_verification_func=ShortCircuitAndTemplate._verify_loop_exit_match)
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
short_circuit_template = ShortCircuitAndTemplate(
|
||||||
|
first_condition=mapping["first_condition"],
|
||||||
|
second_condition=mapping["second_condition"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, short_circuit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(short_circuit_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(short_circuit_template.second_condition, data=True)]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([short_circuit_template.first_condition, short_circuit_template.second_condition])
|
||||||
|
if first_loop_exit_tail := mapping.get("first_loop_exit_tail", None):
|
||||||
|
reduced_cfg.remove_node(first_loop_exit_tail)
|
||||||
|
reduced_cfg.add_node(short_circuit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
first_condition = self.first_condition.to_indented_source(source_lines)
|
||||||
|
second_condition = self.second_condition.to_indented_source(source_lines)
|
||||||
|
return "\n".join([first_condition, second_condition])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
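The loop-exit variant only matches when both jump targets are empty loop exits of the same kind. The sketch below replays that check on a stand-in class. `FakeLoopExit` is hypothetical and models only the two attributes (`tail`, `exit_statement`) that `_verify_loop_exit_match` reads; the real `LoopExitTemplate` is not shown in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeLoopExit:
    # Hypothetical stand-in for LoopExitTemplate.
    exit_statement: str
    tail: Optional[object] = None


def verify_loop_exit_match(mapping: dict) -> bool:
    first_tail = mapping["first_loop_exit_tail"]
    second_tail = mapping["second_loop_exit_tail"]
    if not isinstance(first_tail, FakeLoopExit) or not isinstance(second_tail, FakeLoopExit):
        return False
    # Loop exits that carry code of their own are not split implicit continues.
    if first_tail.tail or second_tail.tail:
        return False
    # Both conditions must leave the loop the same way.
    return first_tail.exit_statement == second_tail.exit_statement


both_continue = verify_loop_exit_match({
    "first_loop_exit_tail": FakeLoopExit("continue"),
    "second_loop_exit_tail": FakeLoopExit("continue"),
})
mixed = verify_loop_exit_match({
    "first_loop_exit_tail": FakeLoopExit("continue"),
    "second_loop_exit_tail": FakeLoopExit("break"),
})
with_code = verify_loop_exit_match({
    "first_loop_exit_tail": FakeLoopExit("continue", tail="x = 1"),
    "second_loop_exit_tail": FakeLoopExit("continue"),
})
print(both_continue, mixed, with_code)
```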
+188
@@ -0,0 +1,188 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_dominator_function
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_unconditional_jump
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ShortCircuitOrContinueTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A short-circuit evaluated boolean OR. Typically these are all part of one line.
|
||||||
|
This variant occurs only when the content of the if statement is just a continue.
|
||||||
|
|
||||||
|
(-1)
|
||||||
|
| \\
|
||||||
|
(0) \\
|
||||||
|
/ \\j | (-10)
|
||||||
|
(1) (2) |j --> / \\j
|
||||||
|
\\j / (1) (2)
|
||||||
|
(3) -/ \\j
|
||||||
|
(3)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
This condenses the short-circuit down to be matched against an if-like template later
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="loop_header",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_unconditional_jump, # this is the continue statement
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="loop_header",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, first_condition: ControlFlowTemplate, second_condition: ControlFlowTemplate):
|
||||||
|
self.first_condition = first_condition
|
||||||
|
self.second_condition = second_condition
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cfg.in_degree(node) != 1:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# to avoid being treated as an if-else, we actually need to greedily search up one layer
|
||||||
|
pred = next(cfg.predecessors(node))
|
||||||
|
|
||||||
|
def verify_loop_header(cfg: nx.DiGraph, mapping: dict[str, ControlFlowTemplate]) -> bool:
|
||||||
|
# the node matched as loop_header must dominate first_condition; otherwise the jump is not a continue back to the enclosing loop
|
||||||
|
dominates = get_dominator_function(cfg)
|
||||||
|
return dominates(mapping["loop_header"], mapping["first_condition"])
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ShortCircuitOrContinueTemplate._subgraph, root_key="first_condition", mapping_verification_func=verify_loop_header)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, pred)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
short_circuit_template = ShortCircuitOrContinueTemplate(
|
||||||
|
first_condition=mapping["first_condition"],
|
||||||
|
second_condition=mapping["second_condition"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, short_circuit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=short_circuit_template.first_condition, data=True))
|
||||||
|
out_edges = [(short_circuit_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(short_circuit_template.second_condition, data=True)]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([short_circuit_template.first_condition, short_circuit_template.second_condition])
|
||||||
|
reduced_cfg.add_node(short_circuit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
first_condition = self.first_condition.to_indented_source(source_lines)
|
||||||
|
second_condition = self.second_condition.to_indented_source(source_lines)
|
||||||
|
return "\n".join([first_condition, second_condition])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
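The `verify_loop_header` check relies on a dominance test. The sketch below shows what such a test computes on a small graph shaped like this template. It is an assumption-laden illustration: `compute_dominators` and `dominates` are hypothetical helpers using the textbook iterative algorithm, and they are not the project's `get_dominator_function`, whose implementation is not shown here.

```python
def compute_dominators(successors: dict[str, list[str]], entry: str) -> dict[str, set[str]]:
    """Iterative dominator sets: dom[n] holds every node that lies on all paths from entry to n."""
    nodes = set(successors) | {s for succs in successors.values() for s in succs}
    preds: dict[str, set[str]] = {n: set() for n in nodes}
    for src, succs in successors.items():
        for dst in succs:
            preds[dst].add(src)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def dominates(dom: dict[str, set[str]], a: str, b: str) -> bool:
    # True when every path from the entry to b passes through a.
    return a in dom[b]


# A loop whose body is `if a or b: continue`, using the template's node names.
cfg = {
    "entry": ["loop_header"],
    "loop_header": ["first_condition", "exit"],
    "first_condition": ["second_condition", "loop_header"],
    "second_condition": ["if_body", "tail"],
    "if_body": ["loop_header"],
    "tail": ["loop_header"],
}
dom = compute_dominators(cfg, "entry")
print(dominates(dom, "loop_header", "first_condition"))
```

Because every path to `first_condition` goes through `loop_header`, the conditional jump from `first_condition` back to `loop_header` behaves like a `continue`, which is the situation the template is meant to capture.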
+157
@@ -0,0 +1,157 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ShortCircuitOrTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A short-circuit evaluated boolean OR. Typically these are all part of one line.
|
||||||
|
(0)
|
||||||
|
/ \\ (01)
|
||||||
|
(1) |j --> / \\j
|
||||||
|
|j\\| (2) (3)
|
||||||
|
(2) (3)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
This condenses the short-circuit down to be matched against an if-like template later
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"first_condition": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="second_condition",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="first_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"second_condition": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="second_condition",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, first_condition: ControlFlowTemplate, second_condition: ControlFlowTemplate):
|
||||||
|
self.first_condition = first_condition
|
||||||
|
self.second_condition = second_condition
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ShortCircuitOrTemplate._subgraph, root_key="first_condition", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
short_circuit_template = ShortCircuitOrTemplate(
|
||||||
|
first_condition=mapping["first_condition"],
|
||||||
|
second_condition=mapping["second_condition"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, short_circuit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(short_circuit_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(short_circuit_template.second_condition, data=True)]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([short_circuit_template.first_condition, short_circuit_template.second_condition])
|
||||||
|
reduced_cfg.add_node(short_circuit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
first_condition = self.first_condition.to_indented_source(source_lines)
|
||||||
|
second_condition = self.second_condition.to_indented_source(source_lines)
|
||||||
|
return "\n".join([first_condition, second_condition])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
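As a hedged illustration of the bytecode shape that ShortCircuitOrTemplate targets, the sketch below uses only the standard `dis` module to show that a one-line `a or b` condition compiles to two conditional jumps, one per operand. The names `either` and `conditional_jump_opnames` are hypothetical helpers for this example and are not part of pylingual; the exact opnames differ between CPython versions, so the sketch only looks for the `POP_JUMP` substring.

```python
import dis


def either(a, b):
    # A single-line boolean OR guarding an if-body.
    if a or b:
        return "body"
    return "tail"


def conditional_jump_opnames(func):
    """Return the opnames of every POP_JUMP* instruction in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func) if "POP_JUMP" in ins.opname]


if __name__ == "__main__":
    # Expect one conditional jump for `a` and another for `b`.
    print(conditional_jump_opnames(either))
```

The first jump corresponds to the "first_condition" node and the second to the "second_condition" node; the template folds both into one node before any if-like template is tried.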
|
||||||
+130
@@ -0,0 +1,130 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_edge, assert_in_degree, node_match_all, is_exactly_opname, contains_opname_sequence
|
||||||
|
|
||||||
|
|
||||||
|
class AsyncWithCleanup312(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"start": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("PUSH_EXC_INFO", "WITH_EXCEPT_START", "GET_AWAITABLE", "LOAD_CONST"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="start",
|
||||||
|
dest="send",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="start", dest="exc"),
|
||||||
|
),
|
||||||
|
"send": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("SEND"), assert_in_degree(2)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="yield",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="ifthen",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="exc",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"yield": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("YIELD_VALUE"), assert_in_degree(1)), natural_edge=TemplateEdge(source="yield", dest="jump_back"), exception_edge=TemplateEdge(source="yield", dest="ifthen")
|
||||||
|
),
|
||||||
|
"jump_back": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("JUMP_BACKWARD_NO_INTERRUPT"), assert_in_degree(1)), natural_edge=TemplateEdge(source="jump_back", dest="send"), exception_edge=TemplateEdge(source="jump_back", dest="exc")
|
||||||
|
),
|
||||||
|
"ifthen": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("CLEANUP_THROW", "END_SEND", "POP_JUMP_IF_TRUE", "RERAISE", "POP_TOP"), assert_in_degree(2)),
|
||||||
|
natural_edge=TemplateEdge(source="ifthen", dest="tail"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="ifthen",
|
||||||
|
dest="exc",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exc": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"), assert_in_degree(4)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(contains_opname_sequence("POP_EXCEPT", "POP_TOP", "POP_TOP"), assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
pass
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=AsyncWithCleanup312._subgraph,
|
||||||
|
root_key="start",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
with_template = AsyncWithCleanup312()
|
||||||
|
|
||||||
|
in_edges = ((src, with_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
# out_edges = ((with_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=mapping['exc'], data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(mapping.values())
|
||||||
|
reduced_cfg.add_node(with_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
return ""
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+136
@@ -0,0 +1,136 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_edge, assert_in_degree, assert_node_has_no_backwards_edges, node_match_all, is_exactly_opname, node_match_any
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
|
||||||
|
class Await312Template(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"awaited": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="awaited",
|
||||||
|
dest="send",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="awaited",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"send": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(2), assert_node_has_no_backwards_edges, is_exactly_opname("SEND")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="yield",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="jump_back",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="send",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"yield": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), assert_node_has_no_backwards_edges, is_exactly_opname("YIELD_VALUE")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="yield",
|
||||||
|
dest="jump_back",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="yield",
|
||||||
|
dest="cleanup_throw",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"jump_back": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(2), is_exactly_opname("JUMP_BACKWARD_NO_INTERRUPT")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="jump_back",
|
||||||
|
dest="send",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="jump_back",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"cleanup_throw": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), is_exactly_opname("CLEANUP_THROW")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="cleanup_throw",
|
||||||
|
dest="jump_back2",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="cleanup_throw",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"jump_back2": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), node_match_any(is_exactly_opname("JUMP_BACKWARD"), is_exactly_opname("JUMP_BACKWARD_NO_INTERRUPT"))),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="jump_back2",
|
||||||
|
dest=None,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, awaited):
|
||||||
|
self.awaited = awaited
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=Await312Template._subgraph, root_key="awaited", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = Await312Template(
|
||||||
|
awaited=mapping["awaited"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(template, next(cfg.successors(mapping["jump_back2"])), {"type": ControlFlowEdgeType.NATURAL.value}), (template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value})]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([template.awaited, mapping["send"], mapping["yield"], mapping["jump_back"], mapping["cleanup_throw"], mapping["jump_back2"]])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
return self.awaited.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
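To make the opnames checked by Await312Template easier to relate to source code, here is a minimal sketch that disassembles a single `await` expression with the standard `dis` module. `wait_for` and `opnames` are hypothetical names introduced for this example. On CPython 3.12 the listing is expected to contain SEND, YIELD_VALUE, JUMP_BACKWARD_NO_INTERRUPT and CLEANUP_THROW, which are the opnames the template verifies; other versions emit a different sequence.

```python
import dis


async def wait_for(awaitable):
    # A single await expression; the coroutine is only disassembled, never run.
    return await awaitable


def opnames(func):
    """Return the opnames in func's bytecode, in instruction order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # The output depends on the interpreter version that runs this script.
    print(opnames(wait_for))
```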
|
||||||
+128
@@ -0,0 +1,128 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_edge, assert_in_degree, node_match_all, is_exactly_opname, contains_opname_sequence, node_match_any
|
||||||
|
|
||||||
|
|
||||||
|
class WithCleanup312(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
_subgraph = {
|
||||||
|
"start": TemplateNode(
|
||||||
|
node_verification_func=node_match_any(
|
||||||
|
is_exactly_opname("PUSH_EXC_INFO", "WITH_EXCEPT_START", "POP_JUMP_IF_TRUE"),
|
||||||
|
is_exactly_opname("PUSH_EXC_INFO", "WITH_EXCEPT_START", "TO_BOOL", "POP_JUMP_IF_TRUE"),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="start",
|
||||||
|
dest="reraise",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="start",
|
||||||
|
dest="poptop",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="start", dest="exc"),
|
||||||
|
),
|
||||||
|
"reraise": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("RERAISE"), assert_in_degree(1)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="reraise",
|
||||||
|
dest="exc",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"poptop": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("POP_TOP"), assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="poptop",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="poptop",
|
||||||
|
dest="exc",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exc": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"), assert_in_degree(3)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exc",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(contains_opname_sequence("POP_EXCEPT", "POP_TOP", "POP_TOP"), assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
pass
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
# to avoid being treated as a try-except, we actually need to greedily search up one layer
|
||||||
|
node = next(cfg.predecessors(node), None)  # None when the node has no predecessor; the check below then returns None
|
||||||
|
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=WithCleanup312._subgraph,
|
||||||
|
root_key="start",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
with_template = WithCleanup312()
|
||||||
|
|
||||||
|
in_edges = ((src, with_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(mapping.values())
|
||||||
|
reduced_cfg.add_node(with_template)
|
||||||
|
reduced_cfg.add_edges_from(in_edges)
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
return ""
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
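The cleanup block that WithCleanup312 matches is generated by the compiler for every `with` statement and has no counterpart in the source text, which is presumably why `to_indented_source` returns an empty string. The sketch below, using only the standard library, lists the with-related opnames for a small function. `read_all` and `with_opnames` are hypothetical names for this example, and the opnames vary by CPython version.

```python
import dis
import io


def read_all(stream):
    # The compiler adds hidden exception-handling bytecode around this block.
    with stream as handle:
        return handle.read()


def with_opnames(func):
    """Return the opnames in func's bytecode that mention WITH."""
    return [ins.opname for ins in dis.get_instructions(func) if "WITH" in ins.opname]


if __name__ == "__main__":
    print(with_opnames(read_all))
    print(read_all(io.StringIO("hello")))
```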
|
||||||
+177
@@ -0,0 +1,177 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_with, node_match_all
|
||||||
|
from ..subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class WithTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
r"""
|
||||||
|
|
||||||
|
A basic with template that acts as a catch-all for ordinary with statements.
|
||||||
|
|
||||||
|
(0) node 2 may point to an outer exception handler
|
||||||
|
|
|
||||||
|
(1)
|
||||||
|
e/ |
|
||||||
|
/ (2)
|
||||||
|
\ |
|
||||||
|
(3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_with": TemplateNode(
|
||||||
|
node_verification_func=assert_with,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest="body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest="begin_finally",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False, # the begin_finally block is optional; when it is absent, do not commit None to the mapping
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="body", dest="with_cleanup"),
|
||||||
|
),
|
||||||
|
"begin_finally": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
optional_node,
|
||||||
|
assert_in_degree(1),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="with_cleanup",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False, # if the destination node is None, don't commit to the mapping
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="begin_finally", dest="exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"with_cleanup": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="with_cleanup",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="with_cleanup",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_with: ControlFlowTemplate, body: ControlFlowTemplate, begin_finally: ControlFlowTemplate, with_cleanup: ControlFlowTemplate):
|
||||||
|
self.setup_with = setup_with
|
||||||
|
self.body = body
|
||||||
|
self.begin_finally = begin_finally
|
||||||
|
self.with_cleanup = with_cleanup
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
# to avoid being treated as a try-except, we actually need to greedily search up one layer
|
||||||
|
node = next(cfg.predecessors(node), None)  # None when the node has no predecessor; the check below then returns None
|
||||||
|
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=WithTemplate._subgraph,
|
||||||
|
root_key="setup_with",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
with_template = WithTemplate(
|
||||||
|
setup_with=mapping["setup_with"],
|
||||||
|
body=mapping["body"],
|
||||||
|
begin_finally=mapping.get("begin_finally", None),
|
||||||
|
with_cleanup=mapping["with_cleanup"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, with_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["tail"]:
|
||||||
|
out_edges.append((with_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
else:
|
||||||
|
out_edges.extend([(with_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=mapping["with_cleanup"], data=True)])
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((with_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([with_template.setup_with, with_template.body, with_template.begin_finally, with_template.with_cleanup])
|
||||||
|
reduced_cfg.add_node(with_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
header = self.setup_with.to_indented_source(source_lines)
|
||||||
|
body = self.body._indent_multiline_string(self.body.to_indented_source(source_lines))
|
||||||
|
# cleanup = self.with_cleanup.to_indented_source(source_lines)
|
||||||
|
return f"{header}\n{body}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
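The reduction step in `try_to_match_node` replaces the matched nodes with one template node and re-attaches the edges that cross the boundary. The following is a simplified, dependency-free sketch of that idea using plain edge triples instead of a networkx graph. `condense` is a hypothetical helper, the edge type strings are placeholders rather than the real ControlFlowEdgeType values, and unlike the code above it re-points every crossing edge instead of selecting specific ones.

```python
def condense(edges, members, merged):
    """Collapse the nodes in `members` into the single node `merged`.

    `edges` is an iterable of (src, dst, props) triples. Edges that lie
    entirely inside `members` are dropped, edges crossing the boundary are
    re-pointed at `merged`, and all other edges are kept unchanged.
    """
    members = set(members)
    reduced = []
    for src, dst, props in edges:
        if src in members and dst in members:
            continue  # internal edge, absorbed by the merged node
        new_src = merged if src in members else src
        new_dst = merged if dst in members else dst
        reduced.append((new_src, new_dst, props))
    return reduced


if __name__ == "__main__":
    cfg_edges = [
        ("entry", "setup_with", {"type": "natural"}),
        ("setup_with", "body", {"type": "natural"}),
        ("body", "with_cleanup", {"type": "exception"}),
        ("with_cleanup", "tail", {"type": "natural"}),
    ]
    print(condense(cfg_edges, {"setup_with", "body", "with_cleanup"}, "WithTemplate"))
```

Building a new edge list rather than mutating the input mirrors the `cfg.copy()` call above, which leaves the caller's graph untouched when a match succeeds.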
|
||||||
+124
@@ -0,0 +1,124 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
from .WithCleanup312 import WithCleanup312
|
||||||
|
from .AsyncWithCleanup312 import AsyncWithCleanup312
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_with, node_match_all, assert_node_type
|
||||||
|
|
||||||
|
|
||||||
|
class WithTemplate312(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
_subgraph = {
|
||||||
|
"setup_with": TemplateNode(
|
||||||
|
node_verification_func=assert_with,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest="body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="body", dest="with_cleanup2", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest="with_cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"with_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), assert_node_type(WithCleanup312, AsyncWithCleanup312)),
|
||||||
|
),
|
||||||
|
"with_cleanup2": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="with_cleanup2",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="with_cleanup2",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="with_cleanup2",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_with: ControlFlowTemplate, body: ControlFlowTemplate, with_cleanup: ControlFlowTemplate, with_cleanup2: ControlFlowTemplate):
|
||||||
|
self.setup_with = setup_with
|
||||||
|
self.body = body
|
||||||
|
self.with_cleanup = with_cleanup
|
||||||
|
self.with_cleanup2 = with_cleanup2
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
node = next(cfg.predecessors(node), None)  # None when the node has no predecessor; the check below then returns None
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=WithTemplate312._subgraph,
|
||||||
|
root_key="setup_with",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
with_template = WithTemplate312(
|
||||||
|
setup_with=mapping["setup_with"],
|
||||||
|
body=mapping["body"],
|
||||||
|
with_cleanup=mapping["with_cleanup"],
|
||||||
|
with_cleanup2=mapping.get("with_cleanup2"),
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, with_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
if "with_cleanup2" in mapping:
|
||||||
|
out_edges = (
|
||||||
|
(with_template, dst, {"type": ControlFlowEdgeType.NATURAL.value} if edge_properties["type"] == ControlFlowEdgeType.JUMP.value else edge_properties)
|
||||||
|
for src, dst, edge_properties in cfg.out_edges(nbunch=mapping["with_cleanup2"], data=True)
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
out_edges = ()
|
||||||
|
out_edges2 = ((with_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=mapping["setup_with"], data=True) if edge_properties["type"] == ControlFlowEdgeType.EXCEPTION.value)
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([with_template.setup_with, with_template.body, with_template.with_cleanup, with_template.with_cleanup2])
|
||||||
|
reduced_cfg.add_node(with_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges, out_edges2))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
header = self.setup_with.to_indented_source(source_lines)
|
||||||
|
body = self._indent_multiline_string(self.body.to_indented_source(source_lines))
|
||||||
|
if self.with_cleanup2 is not None:
|
||||||
|
clean = self.with_cleanup2.to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
clean = ""
|
||||||
|
return f"{header}\n{body}\n{clean}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
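The edge rewiring performed by `WithTemplate312.try_to_match_node` can be sketched without a graph library. This is a minimal illustration, not the project's code: `EdgeType` is a hypothetical stand-in for `ControlFlowEdgeType` (its string values are assumptions), and `rewire_exit_edges` is a made-up helper name. It shows how edges leaving the second cleanup node are re-sourced at the condensed node, with jump edges downgraded to natural edges.

```python
from enum import Enum


class EdgeType(Enum):
    """Hypothetical stand-in for ControlFlowEdgeType; the string values are assumptions."""

    NATURAL = "natural"
    JUMP = "jump"
    EXCEPTION = "exception"


def rewire_exit_edges(out_edges, merged):
    """Re-source each outgoing edge at `merged`, turning jump edges into natural edges."""
    return [
        (merged, dst, {"type": EdgeType.NATURAL.value} if props["type"] == EdgeType.JUMP.value else props)
        for _src, dst, props in out_edges
    ]


# Edges leaving the (hypothetical) second cleanup node before the reduction.
edges = [
    ("cleanup2", "after_with", {"type": EdgeType.JUMP.value}),
    ("cleanup2", "handler", {"type": EdgeType.EXCEPTION.value}),
]
rewired = rewire_exit_edges(edges, "with_block")
```

Once the `with` block is a single node, leaving it is ordinary fall-through, which is presumably why the jump edge is relabelled; exception edges pass through unchanged.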
+115
@@ -0,0 +1,115 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..try_except.TryExceptTemplate import TryExceptTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_with, node_match_all, assert_node_type
|
||||||
|
|
||||||
|
|
||||||
|
class WithTemplate39(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
_subgraph = {
|
||||||
|
"setup_with": TemplateNode(
|
||||||
|
node_verification_func=assert_with,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest="body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="setup_with",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"body": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), assert_node_type(TryExceptTemplate)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_with: ControlFlowTemplate, body: ControlFlowTemplate):
|
||||||
|
self.setup_with = setup_with
|
||||||
|
self.body = body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
        if node not in cfg.nodes:
            return None

        # The match is rooted at the predecessor of the given node; give up if there is none.
        node = next(cfg.predecessors(node), None)
        if node is None:
            return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=WithTemplate39._subgraph,
|
||||||
|
root_key="setup_with",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
with_template = WithTemplate39(
|
||||||
|
setup_with=mapping["setup_with"],
|
||||||
|
body=mapping["body"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, with_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = ((with_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=mapping["body"], data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([with_template.setup_with, with_template.body])
|
||||||
|
reduced_cfg.add_node(with_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
    def to_indented_source(self, source_lines: list[str]) -> str:
        # The body is a TryExceptTemplate (enforced by assert_node_type in _subgraph);
        # only its try_body is emitted underneath the with header.
        header = self.setup_with.to_indented_source(source_lines)
        body = self._indent_multiline_string(self.body.try_body.to_indented_source(source_lines))
        return f"{header}\n{body}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+180
@@ -0,0 +1,180 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class ElseExitExceptTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
An if-else block where only the else has no further control flow (structured breaks/continues and returns).
|
||||||
|
When the exit leaves an exception block, the final exit statement does not have the same exception handler.
|
||||||
|
           (0)
          j/ \\      -->   (0123)
        (1)   (2)            |
         |     |j          (...)
        (3)   (...)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="else_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exit_node",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exit_node": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exit_node",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate, else_body: ControlFlowTemplate, exit_node: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
self.else_body = else_body
|
||||||
|
self.exit_node = exit_node
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ElseExitExceptTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_else_template = ElseExitExceptTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"], else_body=mapping["else_body"], exit_node=mapping["exit_node"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_else_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(if_else_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_else_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_else_template.if_header, if_else_template.if_body, if_else_template.else_body, if_else_template.exit_node])
|
||||||
|
reduced_cfg.add_node(if_else_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
else_body = ControlFlowTemplate._indent_multiline_string(self.else_body.to_indented_source(source_lines))
|
||||||
|
exit_node = ControlFlowTemplate._indent_multiline_string(self.exit_node.to_indented_source(source_lines))
|
||||||
|
return "\n".join([header, if_body, "else: # inserted", else_body, exit_node])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
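The rendering step in `ElseExitExceptTemplate.to_indented_source` can be sketched as a standalone function. This is a sketch under stated assumptions: `render_else_exit_except` is a hypothetical name, and `textwrap.indent` with a four-space prefix stands in for `ControlFlowTemplate._indent_multiline_string`, whose actual indent width is not shown here.

```python
import textwrap

INDENT = "    "  # assumed indent width; the real helper may differ


def render_else_exit_except(header: str, if_body: str, else_body: str, exit_node: str) -> str:
    """Join the header, the indented if body, an inserted else line, and the indented else/exit code."""

    def block(text: str) -> str:
        return textwrap.indent(text, INDENT)

    return "\n".join([header, block(if_body), "else: # inserted", block(else_body), block(exit_node)])


rendered = render_else_exit_except("if ok:", "x = 1", "cleanup()", "return None")
```

The exit statement is indented along with the else body, so it ends up inside the inserted `else:` branch.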
+144
@@ -0,0 +1,144 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class ElseExitTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An if-else block where only the else has no further control flow (structured breaks/continues and returns).
|
||||||
|
           (0)
          j/ \\      -->   (012)
        (1)   (2)            |
               |j           (3)
              (3)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="else_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate, else_body: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
self.else_body = else_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ElseExitTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_else_template = ElseExitTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"], else_body=mapping["else_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_else_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(if_else_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_else_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_else_template.if_header, if_else_template.if_body, if_else_template.else_body])
|
||||||
|
reduced_cfg.add_node(if_else_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
else_body = ControlFlowTemplate._indent_multiline_string(self.else_body.to_indented_source(source_lines))
|
||||||
|
return "\n".join([header, if_body, "else: # inserted", else_body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
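The condensation done by `ElseExitTemplate.try_to_match_node` can be illustrated on a plain edge list. This is a simplified sketch, not the project's implementation: it uses `(src, dst, kind)` tuples instead of a `networkx` graph, `collapse` is a hypothetical helper, and the edge-kind strings are assumptions. Unlike the original, it keeps only in-edges whose source lies outside the matched pattern.

```python
def collapse(edges, header, members, tail, handler, merged):
    """Replace `members` with `merged`, keeping edges into `header`.

    One natural edge to `tail` is added, plus an exception edge when a
    shared handler was matched (handler is not None).
    """
    members = set(members)
    # Edges that do not touch the matched pattern survive unchanged.
    kept = [(s, d, k) for s, d, k in edges if s not in members and d not in members]
    # Edges entering the header from outside are redirected to the merged node.
    in_edges = [(s, merged, k) for s, d, k in edges if d == header and s not in members]
    out_edges = [(merged, tail, "natural")]
    if handler is not None:
        out_edges.append((merged, handler, "exception"))
    return kept + in_edges + out_edges


cfg_edges = [
    ("entry", "h", "natural"),
    ("h", "if", "natural"),
    ("h", "else", "jump"),
    ("if", "tail", "jump"),
    ("h", "exc", "exception"),
    ("if", "exc", "exception"),
    ("else", "exc", "exception"),
    ("tail", "end", "natural"),
]
reduced = collapse(cfg_edges, "h", ["h", "if", "else"], "tail", "exc", "T")
```

After the reduction the three pattern nodes are gone, and `T` has exactly one natural successor and one exception successor.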
+148
@@ -0,0 +1,148 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
from ..try_except.ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
from ..try_except.ExceptAsExceptTemplate import ExceptAsExceptTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptExitTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A `try-except` block where the except has no further control flow.
|
||||||
|
           (0)
           / \\e     -->   (012)
        (1)   (2)            |
         |                  (3)
        (3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="try_footer",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_footer": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="try_footer", dest="after_try_except", edge_verification_func=assert_edge_type(ControlFlowEdgeType.JUMP)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_body: ControlFlowTemplate, try_footer: ControlFlowTemplate, except_body: ControlFlowTemplate):
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_footer = try_footer
|
||||||
|
self.except_body = except_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptExitTemplate._subgraph, root_key="try_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
try_except_template = ExceptExitTemplate(try_body=mapping["try_body"], try_footer=mapping["try_footer"], except_body=mapping["except_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the try except
|
||||||
|
out_edges = [(try_except_template, mapping["after_try_except"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_body, try_except_template.try_footer, try_except_template.except_body])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_body = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
try_except_lines = ["try:", try_body]
|
||||||
|
# if we matched against an "Except ... as" chain, then omit the inserted except: block
|
||||||
|
        if isinstance(self.except_body, (ExceptAsTemplate, ExceptAsExceptTemplate)):
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
except_body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
try_except_lines.append("except:")
|
||||||
|
try_except_lines.append(except_body)
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
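The branching in `ExceptExitTemplate.to_indented_source` can be sketched with plain strings. This sketch rests on assumptions: `render_try_except` is a hypothetical function, the boolean `has_own_header` replaces the `isinstance` check against the `except ... as` templates, and `textwrap.indent` with four spaces stands in for `_indent_multiline_string`.

```python
import textwrap

INDENT = "    "  # assumed indent width


def render_try_except(try_body: str, except_body: str, has_own_header: bool) -> str:
    """Emit a try block and insert a bare `except:` only when the handler lacks its own header."""
    lines = ["try:", textwrap.indent(try_body, INDENT)]
    if has_own_header:
        # The handler already starts with its own `except X as e:` line.
        lines.append(except_body)
    else:
        lines.append("except:")
        lines.append(textwrap.indent(except_body, INDENT))
    return "\n".join(lines)


bare = render_try_except("x = f()", "return None", has_own_header=False)
named = render_try_except("x = f()", "except ValueError as e:\n    raise", has_own_header=True)
```

Without that check, a handler that already carries a header would be emitted with a second, spurious `except:` line above it.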
+185
@@ -0,0 +1,185 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, node_is_none_or_matches, edge_is_none_or_matches
|
||||||
|
|
||||||
|
|
||||||
|
class IfElseExitTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An if-else block where both options have no further control flow (structured breaks/continues and returns).
|
||||||
|
           (0)
          j/ \\      -->   (012)
        (1)   (2)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
nodes 1 and 2 can optionally have a "tail" that is an exit statement that breaks out of the current exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="else_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="if_tail",
|
||||||
|
edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_tail": TemplateNode(
|
||||||
|
node_verification_func=node_is_none_or_matches(assert_in_degree(1)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_tail",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="else_tail",
|
||||||
|
edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_tail": TemplateNode(
|
||||||
|
node_verification_func=node_is_none_or_matches(assert_in_degree(1)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_tail",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
if_header: ControlFlowTemplate,
|
||||||
|
if_body: ControlFlowTemplate,
|
||||||
|
if_tail: ControlFlowTemplate,
|
||||||
|
else_body: ControlFlowTemplate,
|
||||||
|
else_tail: ControlFlowTemplate,
|
||||||
|
):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
        self.if_tail = if_tail  # may be None
        self.else_body = else_body
        self.else_tail = else_tail  # may be None
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfElseExitTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_else_template = IfElseExitTemplate(
|
||||||
|
if_header=mapping["if_header"],
|
||||||
|
if_body=mapping["if_body"],
|
||||||
|
if_tail=mapping["if_tail"],
|
||||||
|
else_body=mapping["else_body"],
|
||||||
|
else_tail=mapping["else_tail"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, if_else_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
# only preserve meta edges
|
||||||
|
out_edges = [(if_else_template, "END", data) for _, _, data in cfg.out_edges([if_else_template.if_body, if_else_template.else_body], data=True) if data["type"] == ControlFlowEdgeType.META.value]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_else_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_else_template.if_header, if_else_template.if_body, if_else_template.if_tail, if_else_template.else_body, if_else_template.else_tail])
|
||||||
|
reduced_cfg.add_node(if_else_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
if_body = self.if_body.to_indented_source(source_lines)
|
||||||
|
if header.strip():
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(if_body)
|
||||||
|
else_body = self.else_body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
return "\n".join([header, if_body, else_body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
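The conditional indentation in `IfElseExitTemplate.to_indented_source` can be sketched in isolation. This is a sketch under assumptions: `render_if_else_exit` is a hypothetical name and the four-space indent is a guess at what `_indent_multiline_string` does. The if body is indented only when the header renders to visible text, and the else body is emitted at the header's level because the if branch never falls through.

```python
import textwrap

INDENT = "    "  # assumed indent width


def render_if_else_exit(header: str, if_body: str, else_body: str) -> str:
    """Indent the if body only when a non-empty header precedes it."""
    if header.strip():
        if_body = textwrap.indent(if_body, INDENT)
    return "\n".join([header, if_body, else_body])


with_header = render_if_else_exit("if x:", "return 1", "return 2")
without_header = render_if_else_exit("", "return 1", "return 2")
```

Since both branches exit, no `else:` line is needed: the code after the if body already runs only when the condition is false.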
+162
@@ -0,0 +1,162 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_out_edge_dict, ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class IfExitExceptTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An if block where the if has no further control flow (structured breaks/continues and returns).
|
||||||
|
When the exit leaves an exception block, the final exit statement does not have the same exception handler.
|
||||||
|
|
||||||
|
(0)
|
||||||
|
j/ \\ --> (023)
|
||||||
|
(1) (2) |
|
||||||
|
| | (1)
|
||||||
|
... (3)
|
||||||
|
|
||||||
|
In this configuration, (0,1,2) share an exception handler, but 3 does not
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exit_node",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exit_node": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exit_node",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate, exit_node: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
self.exit_node = exit_node
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# match the pattern in which the header, body, and tail share an exception handler
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfExitExceptTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_exit_template = IfExitExceptTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"], exit_node=mapping["exit_node"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
node_edge_dict = get_out_edge_dict(cfg, node)
|
||||||
|
out_edges = [(if_exit_template, node_edge_dict["conditional"][0], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if node_edge_dict["exception"]:
|
||||||
|
out_edges.append((if_exit_template, *(node_edge_dict["exception"])))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_exit_template.if_header, if_exit_template.if_body, if_exit_template.exit_node])
|
||||||
|
reduced_cfg.add_node(if_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
exit_node = ControlFlowTemplate._indent_multiline_string(self.exit_node.to_indented_source(source_lines))
|
||||||
|
return "\n".join([header, if_body, exit_node])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
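The reduction step shared by the `try_to_match_node` methods (copy the graph, drop the matched nodes, add one template node, reattach the surviving edges) can be illustrated without networkx. This is a minimal sketch under stated assumptions, not the project's implementation: a plain list of `(src, dst, type)` tuples stands in for `nx.DiGraph`, and `collapse_nodes` and its parameters are hypothetical names.

```python
from typing import Hashable

Edge = tuple[Hashable, Hashable, str]


def collapse_nodes(
    edges: list[Edge],
    matched: set[Hashable],
    root: Hashable,
    merged: Hashable,
    out_edges: list[Edge],
) -> list[Edge]:
    """Replace the nodes in `matched` with the single node `merged`.

    Edges that enter `root` from outside the match are redirected to `merged`.
    Every other edge touching a matched node is dropped, and the caller
    supplies the outgoing edges of the merged node explicitly.
    """
    # in-edges of the root, re-pointed at the merged node
    in_edges = [(src, merged, kind) for src, dst, kind in edges if dst == root and src not in matched]
    # edges that touch no matched node survive unchanged
    kept = [(src, dst, kind) for src, dst, kind in edges if src not in matched and dst not in matched]
    return kept + in_edges + out_edges


# hypothetical four-edge graph: an if header whose body exits, followed by a tail
edges = [
    ("entry", "if_header", "natural"),
    ("if_header", "if_body", "natural"),
    ("if_header", "tail", "conditional"),
    ("tail", "end", "natural"),
]
reduced = collapse_nodes(
    edges,
    matched={"if_header", "if_body"},
    root="if_header",
    merged="IfExit",
    out_edges=[("IfExit", "tail", "natural")],
)
```

As in the templates, the conditional edge to the tail is re-emitted as a natural edge of the merged node, because once the exiting body is folded in, the tail is the only place control can continue.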
+133
@@ -0,0 +1,133 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_out_edge_dict, ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class IfExitTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An if block where the if has no further control flow (structured breaks/continues and returns).
|
||||||
|
(0)
|
||||||
|
j/ \\ --> (02)
|
||||||
|
(1) (2) |
|
||||||
|
| (1)
|
||||||
|
...
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfExitTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_exit_template = IfExitTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
node_edge_dict = get_out_edge_dict(cfg, node)
|
||||||
|
out_edges = [(if_exit_template, node_edge_dict["conditional"][0], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if node_edge_dict["exception"]:
|
||||||
|
out_edges.append((if_exit_template, *(node_edge_dict["exception"])))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_exit_template.if_header, if_exit_template.if_body])
|
||||||
|
reduced_cfg.add_node(if_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
if_body = self.if_body.to_indented_source(source_lines)
|
||||||
|
# sometimes there is no header in the case of short-circuit boolean AND
|
||||||
|
if header:
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(if_body)
|
||||||
|
return "\n".join([header, if_body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
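The `to_indented_source` methods build source text by placing the child's text one level under its header. The following is a minimal sketch of that step. It assumes `ControlFlowTemplate._indent_multiline_string` behaves like `textwrap.indent` with a four-space prefix (the real helper is not shown in this diff), and `render_if_exit` is a hypothetical stand-in for `IfExitTemplate.to_indented_source`.

```python
import textwrap


def indent_multiline_string(text: str, prefix: str = "    ") -> str:
    """Indent every non-blank line of `text` by `prefix` (assumed behavior)."""
    return textwrap.indent(text, prefix)


def render_if_exit(header: str, if_body: str) -> str:
    """Join a header and its body, indenting the body only when a header exists."""
    # with an empty header (e.g. a short-circuit boolean AND) the body keeps its level
    if header:
        if_body = indent_multiline_string(if_body)
    return "\n".join([header, if_body])


rendered = render_if_exit("if x:", "y = 1\nreturn y")
```

Note that an empty header still contributes an empty first line to the joined result, which matches what `"\n".join([header, if_body])` does in the template.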
+139
@@ -0,0 +1,139 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..try_except.ExceptAsExceptTemplate import ExceptAsExceptTemplate
|
||||||
|
from ..try_except.ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
from ..try_except.ExceptAsExitTemplate import ExceptAsExitTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_instruction_opname, assert_except_as
|
||||||
|
|
||||||
|
|
||||||
|
class TryExitExceptExitTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A try block where neither the try body nor the except body has any further control flow (structured breaks/continues and returns).
|
||||||
|
(0)
|
||||||
|
| --> (012)
|
||||||
|
(1)
|
||||||
|
|e
|
||||||
|
(2)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="try_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_finally: ControlFlowTemplate, try_body: ControlFlowTemplate, except_body: ControlFlowTemplate):
|
||||||
|
self.setup_finally = setup_finally
|
||||||
|
self.try_body = try_body
|
||||||
|
self.except_body = except_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# an "except ... as" exit has exactly this shape, so verify that this match is not part of that larger pattern
|
||||||
|
def assert_not_in_except_as(cfg: nx.DiGraph, mapping: dict) -> bool:
|
||||||
|
setup_finally = mapping["setup_finally"]
|
||||||
|
if cfg.in_degree(setup_finally) != 1:
|
||||||
|
return True
|
||||||
|
|
||||||
|
pred = next(cfg.predecessors(setup_finally))
|
||||||
|
return not assert_except_as(cfg, pred)
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryExitExceptExitTemplate._subgraph, root_key="setup_finally", mapping_verification_func=assert_not_in_except_as)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
try_exit_template = TryExitExceptExitTemplate(setup_finally=mapping["setup_finally"], try_body=mapping["try_body"], except_body=mapping["except_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_exit_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([try_exit_template.setup_finally, try_exit_template.try_body, try_exit_template.except_body])
|
||||||
|
reduced_cfg.add_node(try_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
setup_finally = self.setup_finally.to_indented_source(source_lines)
|
||||||
|
try_body = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
try_except_lines = [setup_finally, "try:", try_body]
|
||||||
|
# if we matched against an "Except ... as" chain, then omit the inserted except: block
|
||||||
|
if isinstance(self.except_body, (ExceptAsTemplate, ExceptAsExceptTemplate, ExceptAsExitTemplate)):
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
except_body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
try_except_lines.append("except:")
|
||||||
|
try_except_lines.append(except_body)
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+153
@@ -0,0 +1,153 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractExceptionTemplate
|
||||||
|
from ..try_except.ExceptAsExceptTemplate import ExceptAsExceptTemplate
|
||||||
|
from ..try_except.ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class TryExitTemplate(ControlFlowTemplate, AbstractExceptionTemplate):
|
||||||
|
"""
|
||||||
|
A try block where the try body has no further control flow (structured breaks/continues and returns).
|
||||||
|
(0)
|
||||||
|
e/ \\ --> (012)
|
||||||
|
(1) (2) |
|
||||||
|
| (3)
|
||||||
|
(3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="try_exit",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_exit": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_exit",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="after_try_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_body: ControlFlowTemplate, try_exit: ControlFlowTemplate, except_body: ControlFlowTemplate):
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_exit = try_exit
|
||||||
|
self.except_body = except_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryExitTemplate._subgraph, root_key="try_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
try_exit_template = TryExitTemplate(try_body=mapping["try_body"], try_exit=mapping["try_exit"], except_body=mapping["except_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_exit_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping["after_try_except"]:
|
||||||
|
out_edges.append((try_exit_template, mapping["after_try_except"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([try_exit_template.try_body, try_exit_template.try_exit, try_exit_template.except_body])
|
||||||
|
reduced_cfg.add_node(try_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_body = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
try_exit = ControlFlowTemplate._indent_multiline_string(self.try_exit.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
try_except_lines = ["try:", try_body, try_exit]
|
||||||
|
# if we matched against an "Except ... as" chain, then omit the inserted except: block
|
||||||
|
if isinstance(self.except_body, (ExceptAsTemplate, ExceptAsExceptTemplate)):
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
except_body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
try_except_lines.append("except:")
|
||||||
|
try_except_lines.append(except_body)
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
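The try/except templates assemble their text as a list of lines and insert a bare `except:` only when the handler does not already carry its own `except ... as` header. The following is a minimal sketch of that decision, not the project's code: `Block`, `ExceptAsBlock`, and `render_try` are hypothetical stand-ins for the template classes and for `to_indented_source`, and indentation is assumed to be four spaces.

```python
import textwrap
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a template whose source text is already known."""

    text: str

    def to_indented_source(self) -> str:
        return self.text


class ExceptAsBlock(Block):
    """Stand-in for a handler that already starts with its own `except ... as` line."""


def render_try(try_body: Block, except_body: Block) -> str:
    """Render a try statement from a body and a handler."""
    lines = ["try:", textwrap.indent(try_body.to_indented_source(), "    ")]
    if isinstance(except_body, ExceptAsBlock):
        # the handler carries its own header, so no bare `except:` is inserted
        lines.append(except_body.to_indented_source())
    else:
        lines.append("except:")
        lines.append(textwrap.indent(except_body.to_indented_source(), "    "))
    return "\n".join(lines)


bare = render_try(Block("return f()"), Block("pass"))
named = render_try(Block("return f()"), ExceptAsBlock("except ValueError as e:\n    raise"))
```

Passing a tuple of classes to `isinstance` would extend the same check to several handler types at once.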
+115
@@ -0,0 +1,115 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge
|
||||||
|
|
||||||
|
|
||||||
|
class ConditionalExitTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A conditional exit within a line. Typically due to an assert statement.
|
||||||
|
(0)
|
||||||
|
j| --> (01)
|
||||||
|
(1)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"exit_header": TemplateNode(
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exit_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exit_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, exit_header: ControlFlowTemplate, tail: ControlFlowTemplate):
|
||||||
|
self.exit_header = exit_header
|
||||||
|
self.tail = tail
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ConditionalExitTemplate._subgraph, root_key="exit_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
conditional_exit_template = ConditionalExitTemplate(
|
||||||
|
exit_header=mapping["exit_header"],
|
||||||
|
tail=mapping["tail"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, conditional_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = ((conditional_exit_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=mapping["tail"], data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([conditional_exit_template.exit_header, conditional_exit_template.tail])
|
||||||
|
reduced_cfg.add_node(conditional_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.exit_header.to_indented_source(source_lines)
|
||||||
|
tail = self.tail.to_indented_source(source_lines)
|
||||||
|
return "\n".join([header, tail])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+194
@@ -0,0 +1,194 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..natural.InstructionTemplate import InstructionTemplate
|
||||||
|
|
||||||
|
from ..subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, node_match_all, assert_node_has_no_backwards_edges, node_match_none, assert_except_as, is_exactly_opname
|
||||||
|
|
||||||
|
from ..natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class IfElseTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A standard if-else-block with no extra control flow.
|
||||||
|
(0)
|
||||||
|
j/ \\ (012)
|
||||||
|
(1) (2) --> |
|
||||||
|
\\ /j (3)
|
||||||
|
(3)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
|
||||||
|
Interestingly, this template also covers loops with guaranteed breaks and an else block.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
node_verification_func=node_match_none(assert_except_as, is_exactly_opname("CLEANUP_THROW", "END_SEND", "POP_JUMP_IF_TRUE")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="else_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate, else_body: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
self.else_body = else_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfElseTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_else_template = IfElseTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"], else_body=mapping["else_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_else_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
if "tail" in mapping:
|
||||||
|
out_edges = [(if_else_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
else:
|
||||||
|
out_edges = []
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_else_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_else_template.if_header, if_else_template.if_body, if_else_template.else_body])
|
||||||
|
reduced_cfg.add_node(if_else_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
if_lines = []
|
||||||
|
header = self.if_header.to_indented_source(source_lines).rstrip()
|
||||||
|
if header and header.split("\n")[-1].strip().startswith("assert "):
|
||||||
|
return "\n".join([header, self.if_body.to_indented_source(source_lines), self.else_body.to_indented_source(source_lines)])
|
||||||
|
if header:
|
||||||
|
if_lines.append(header)
|
||||||
|
if_body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
if if_body:
|
||||||
|
if_lines.append(if_body)
|
||||||
|
else_body = ControlFlowTemplate._indent_multiline_string(self.else_body.to_indented_source(source_lines))
|
||||||
|
if else_body:
|
||||||
|
if_lines.extend(["else: # inserted", else_body])
|
||||||
|
|
||||||
|
# Edge case: a for loop with a guaranteed break has exactly the same CFG shape as an if statement, so the break is re-inserted here.
|
||||||
|
# A while loop with a guaranteed break is deliberately translated as an if statement, so it needs no special handling.
|
||||||
|
if isinstance(self.if_header, LinearSequenceTemplate):
|
||||||
|
last_member = self.if_header.members[-1]
|
||||||
|
if isinstance(last_member, InstructionTemplate) and last_member.instruction.opname == "FOR_ITER":
|
||||||
|
if_lines.insert(2, "\tbreak # inserted")
|
||||||
|
|
||||||
|
return "\n".join(if_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+162
@@ -0,0 +1,162 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..natural.InstructionTemplate import InstructionTemplate
|
||||||
|
from ..natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_unconditional_jump
|
||||||
|
|
||||||
|
|
||||||
|
class IfThenJumpTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A standard if-block with no extra control flow.
|
||||||
|
This variant has an absolute jump from the end of the if body to the outside.
|
||||||
|
This occurs when there are nested if-else blocks and the inner if statements jump out directly to the top level.
|
||||||
|
(0)
|
||||||
|
| \\ (01)
|
||||||
|
j| (1) --> |
|
||||||
|
| | (2)
|
||||||
|
| (2) |j
|
||||||
|
| /j (3)
|
||||||
|
(3)
|
||||||
|
|
||||||
|
Optionally, all nodes in the pattern can share a single exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="jump",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"jump": TemplateNode(
|
||||||
|
node_verification_func=assert_unconditional_jump,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="jump",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="jump",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfThenJumpTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_then_template = IfThenJumpTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_then_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(if_then_template, mapping["jump"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_then_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_then_template.if_header, if_then_template.if_body])
|
||||||
|
reduced_cfg.add_node(if_then_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines).strip()
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
if_lines = [header, body]
|
||||||
|
|
||||||
|
# Edge case: a for loop with a guaranteed break has exactly the same CFG shape as an if statement, so the break is re-inserted here.
|
||||||
|
# A while loop with a guaranteed break is deliberately translated as an if statement, so it needs no special handling.
|
||||||
|
if isinstance(self.if_header, LinearSequenceTemplate):
|
||||||
|
last_member = self.if_header.members[-1]
|
||||||
|
if isinstance(last_member, InstructionTemplate) and last_member.instruction.opname == "FOR_ITER":
|
||||||
|
if_lines.insert(2, "\tbreak # inserted")
|
||||||
|
|
||||||
|
return "\n".join(if_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+173
@@ -0,0 +1,173 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..natural.InstructionTemplate import InstructionTemplate
|
||||||
|
from ..natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
from ..subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_node_has_no_backwards_edges, node_match_all, assert_except_as, node_match_none
|
||||||
|
|
||||||
|
|
||||||
|
class IfThenTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A standard if-block with no extra control flow.
|
||||||
|
(0)
|
||||||
|
| \\ (01)
|
||||||
|
j| (1) --> |
|
||||||
|
| / (2)
|
||||||
|
(2)
|
||||||
|
|
||||||
|
Optionally, all nodes in the pattern can share a single exception handler.
|
||||||
|
|
||||||
|
This template also matches loops with a guaranteed break, because their control flow graph has the same shape as an if statement.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
node_verification_func=node_match_none(assert_except_as),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, if_body: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.if_body = if_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=IfThenTemplate._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if_then_template = IfThenTemplate(if_header=mapping["if_header"], if_body=mapping["if_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, if_then_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(if_then_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((if_then_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([if_then_template.if_header, if_then_template.if_body])
|
||||||
|
reduced_cfg.add_node(if_then_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.if_header.to_indented_source(source_lines).strip()
|
||||||
|
"""
|
||||||
|
if header.startswith('while ') and isinstance(self.if_body, RefinedLoopTemplate) and isinstance(self.if_body.loop_header, WhileTruePlaceholderTemplate):
|
||||||
|
if isinstance(self.if_body.loop_body, LinearSequenceTemplate):
|
||||||
|
last = self.if_body.loop_body.members[-1]
|
||||||
|
else:
|
||||||
|
last = self.if_body.loop_body
|
||||||
|
assert isinstance(last, IfElseTemplate)
|
||||||
|
last.to_indented_source = last.if_header.to_indented_source
|
||||||
|
self.if_body.loop_header.to_indented_source = lambda x: ''
|
||||||
|
if isinstance(self.if_body, LoopExitTemplate) and not header.startswith('if '):
|
||||||
|
body = ''
|
||||||
|
else:
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
"""
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.if_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
if_lines = [header, body]
|
||||||
|
|
||||||
|
# Edge case: a for loop with a guaranteed break has exactly the same CFG shape as an if statement, so the break is re-inserted here.
|
||||||
|
# A while loop with a guaranteed break is deliberately translated as an if statement, so it needs no special handling.
|
||||||
|
if isinstance(self.if_header, LinearSequenceTemplate):
|
||||||
|
last_member = self.if_header.members[-1]
|
||||||
|
if isinstance(last_member, InstructionTemplate) and last_member.instruction.opname == "FOR_ITER":
|
||||||
|
if_lines.insert(2, "\tbreak # inserted")
|
||||||
|
|
||||||
|
return "\n".join(if_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+152
@@ -0,0 +1,152 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_instruction_opname, node_match_all, assert_first_instruction_opname
|
||||||
|
|
||||||
|
|
||||||
|
class AsyncForTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An async for loop.
|
||||||
|
(-1)
|
||||||
|
| ^
|
||||||
|
(0) |j
|
||||||
|
| \\| (-101)
|
||||||
|
e| (1) --> |
|
||||||
|
| (2)
|
||||||
|
(2)
|
||||||
|
|
||||||
|
Optionally, all nodes in the pattern can share a single exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"loop_header": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="loop_iter",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_iter": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), assert_first_instruction_opname("GET_ANEXT")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_iter",
|
||||||
|
dest="loop_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_iter",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="loop_header",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, loop_header: ControlFlowTemplate, loop_iter: ControlFlowTemplate, loop_body: ControlFlowTemplate):
|
||||||
|
self.loop_header = loop_header
|
||||||
|
self.loop_iter = loop_iter
|
||||||
|
self.loop_body = loop_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cfg.in_degree(node) != 1:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# To avoid being matched as a try-except, start the match one level up, at the sole predecessor of this node.
|
||||||
|
pred = next(cfg.predecessors(node))
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=AsyncForTemplate._subgraph, root_key="loop_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, pred)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
loop_template = AsyncForTemplate(loop_header=mapping["loop_header"], loop_iter=mapping["loop_iter"], loop_body=mapping["loop_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, loop_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=pred, data=True) if src != mapping["loop_body"])
|
||||||
|
out_edges = [(loop_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((loop_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([loop_template.loop_header, loop_template.loop_iter, loop_template.loop_body])
|
||||||
|
reduced_cfg.add_node(loop_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.loop_header.to_indented_source(source_lines)
|
||||||
|
loop_iter = self.loop_iter.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.loop_body.to_indented_source(source_lines))
|
||||||
|
return "\n".join([header, loop_iter, body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+120
@@ -0,0 +1,120 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
from ..natural.InstructionTemplate import InstructionTemplate
|
||||||
|
from .LoopExitTemplate import LoopExitTemplate
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, node_match_all
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
|
||||||
|
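# True when node is a `continue` LoopExitTemplate whose tail instruction is a JUMP_BACKWARD targeting a FOR_ITER.
def is_j(cfg: nx.DiGraph, node) -> bool: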
|
||||||
|
return isinstance(node, LoopExitTemplate) and isinstance(node.tail, InstructionTemplate) and node.tail.instruction.opname == "JUMP_BACKWARD" and node.exit_statement == "continue" and node.tail.instruction.target.opname == "FOR_ITER"
|
||||||
|
|
||||||
|
|
||||||
|
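# An if inside a for loop: the header's natural edge leads to a jump back to FOR_ITER (see is_j) and its conditional edge leads to the real body.
class ForIf312Template(ControlFlowTemplate):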
|
||||||
|
_subgraph = {
|
||||||
|
"if_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="jump_back",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="real_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"jump_back": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), is_j),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="jump_back",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"real_body": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="real_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, if_header: ControlFlowTemplate, body: ControlFlowTemplate, jb: ControlFlowTemplate):
|
||||||
|
self.if_header = if_header
|
||||||
|
self.body = body
|
||||||
|
self.jb = jb
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ForIf312Template._subgraph, root_key="if_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = ForIf312Template(if_header=mapping["if_header"], body=mapping["real_body"], jb=mapping["jump_back"])
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge) for src, dst, edge in cfg.in_edges(node, data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([mapping["if_header"], mapping["real_body"], mapping["jump_back"]])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(in_edges)
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
reduced_cfg.add_edge(template, mapping["exception_handler"], type=ControlFlowEdgeType.EXCEPTION.value)
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
header = self.if_header.to_indented_source(source_lines)
|
||||||
|
body = self._indent_multiline_string(self.body.to_indented_source(source_lines))
|
||||||
|
"""
|
||||||
|
n = header.strip().split('\n')[-1].strip().startswith('if not ')
|
||||||
|
fj = self.if_header.get_instructions()[-1].opname == 'POP_JUMP_IF_FALSE'
|
||||||
|
breakpoint()
|
||||||
|
if fj != n:
|
||||||
|
header += '\n\tpass\nelse: # inserted'
|
||||||
|
"""
|
||||||
|
last = max((i.starts_line for i in self.if_header.get_instructions() if i.starts_line is not None), default=None)
|
||||||
|
if last is not None and last < len(source_lines) and body.split("\n")[0].strip() != source_lines[last].strip():
|
||||||
|
header += "\n\tpass\nelse: # inserted"
|
||||||
|
return header + "\n" + body
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+97
@@ -0,0 +1,97 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, node_match_all
|
||||||
|
|
||||||
|
|
||||||
|
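# True when node is a cleanup block of the form SWAP, POP_TOP, SWAP, zero or more STORE_FAST, then a final RERAISE.
def is_cleanup(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool: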
|
||||||
|
insts = node.get_instructions()
|
||||||
|
if not insts or insts[-1].opname != "RERAISE":
|
||||||
|
return False
|
||||||
|
if [i.opname for i in insts[:3]] != ["SWAP", "POP_TOP", "SWAP"]:
|
||||||
|
return False
|
||||||
|
return all(i.opname == "STORE_FAST" for i in insts[3:-1])
|
||||||
|
|
||||||
|
|
||||||
|
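# An inlined comprehension: the comprehension node together with the cleanup block reached by its exception edge (see is_cleanup).
class InlinedComprehensionTemplate(ControlFlowTemplate):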
|
||||||
|
_subgraph = {
|
||||||
|
"comp": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="comp",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="comp",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), is_cleanup),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="tail", dest=None, edge_verification_func=optional_edge),
|
||||||
|
conditional_edge=TemplateEdge(source="tail", dest=None, edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(source="tail", dest="exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(source="exception_handler", dest=None, edge_verification_func=optional_edge),
|
||||||
|
conditional_edge=TemplateEdge(source="exception_handler", dest=None, edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(source="exception_handler", dest=None, edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, comp: ControlFlowTemplate, cleanup: ControlFlowTemplate):
|
||||||
|
self.comp = comp
|
||||||
|
self.cleanup = cleanup
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=InlinedComprehensionTemplate._subgraph, root_key="comp", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = InlinedComprehensionTemplate(comp=mapping["comp"], cleanup=mapping["cleanup"])
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = [(template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([template.comp, template.cleanup])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
return ""
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
@@ -0,0 +1,54 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..try_except.ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_out_edge_dict, ControlFlowEdgeType
|
||||||
|
|
||||||
|
|
||||||
|
class LoopExitTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A wrapper for identified break and continue statements.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, exit_statement: str, tail: ControlFlowTemplate = None):
|
||||||
|
self.tail = tail
|
||||||
|
self.exit_statement = exit_statement
|
||||||
|
assert self.exit_statement in ["break", "continue"]
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
raise NotImplementedError("Loop Exits do not have localized matching logic. These are assigned in refine_loops.")
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def structure_edge_inplace(cfg: nx.DiGraph, edge: tuple, exit_statment: str) -> "LoopExitTemplate":
|
||||||
|
src, dst = edge
|
||||||
|
edge_properties = cfg.get_edge_data(src, dst)
|
||||||
|
|
||||||
|
cfg.remove_edge(src, dst)
|
||||||
|
# for an unconditional jump, integrate the tail into the exit template
|
||||||
|
if edge_properties.get("type", None) == ControlFlowEdgeType.JUMP.value:
|
||||||
|
template = LoopExitTemplate(exit_statement=exit_statment, tail=src)
|
||||||
|
nx.relabel_nodes(cfg, {src: template}, copy=False)
|
||||||
|
else:
|
||||||
|
template = LoopExitTemplate(exit_statement=exit_statment)
|
||||||
|
cfg.add_edge(src, template, **edge_properties)
|
||||||
|
src_exception_handler = get_out_edge_dict(cfg, src).get("exception")
|
||||||
|
if src_exception_handler != (None, None):
|
||||||
|
cfg.add_edge(template, src_exception_handler[0], type=ControlFlowEdgeType.EXCEPTION.value)
|
||||||
|
return template
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
tail_source = self.tail.to_indented_source(source_lines) + "\n" if self.tail else ""
|
||||||
|
|
||||||
|
exit_statement = self.exit_statement
|
||||||
|
if isinstance(self.tail, ExceptAsTemplate):
|
||||||
|
exit_statement = ControlFlowTemplate._indent_multiline_string(self.exit_statement)
|
||||||
|
|
||||||
|
return tail_source + exit_statement
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
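The string assembly in `LoopExitTemplate.to_indented_source` can be sketched without the template classes. This is a minimal sketch, not the project's code: `render_loop_exit` and its parameters are hypothetical names, and `textwrap.indent` with four spaces is an assumed stand-in for `ControlFlowTemplate._indent_multiline_string`, whose implementation is not shown here.

```python
import textwrap


def render_loop_exit(exit_statement: str, tail_source: str = "", tail_is_except_as: bool = False) -> str:
    """Hypothetical helper mirroring how a loop exit is joined to its tail source."""
    if exit_statement not in ("break", "continue"):
        raise ValueError(f"unsupported exit statement: {exit_statement!r}")
    # The conditional expression binds loosest, so the newline is appended
    # only when there is tail source to separate from the exit statement.
    prefix = tail_source + "\n" if tail_source else ""
    if tail_is_except_as:
        # Assumption: one indentation level is four spaces.
        exit_statement = textwrap.indent(exit_statement, "    ")
    return prefix + exit_statement


print(render_loop_exit("break", "x = 1"))
print(render_loop_exit("continue"))
```

The exit statement is indented only when the tail is an `except ... as` block, so that it lands inside that block rather than after it.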
@@ -0,0 +1,143 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType, get_dominator_function
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class LoopTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A natural non-infinite loop with no extra control flow.
|
||||||
|
(0)
|
||||||
|
| \\ (01)
|
||||||
|
j| (1) --> |
|
||||||
|
| (2)
|
||||||
|
(2)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"loop_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="loop_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="tail",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="loop_header",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, loop_header: ControlFlowTemplate, loop_body: ControlFlowTemplate):
|
||||||
|
self.loop_header = loop_header
|
||||||
|
self.loop_body = loop_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
def verify_tail_not_in_loop(cfg: nx.DiGraph, mapping: dict) -> bool:
|
||||||
|
dominates = get_dominator_function(cfg)
|
||||||
|
# subgraph containing all nodes dominated by the loop header
|
||||||
|
dominated_subgraph: nx.DiGraph = cfg.subgraph(n for n in cfg.nodes if dominates(mapping["loop_header"], n))
|
||||||
|
reverse_reachability_map = nx.single_source_shortest_path_length(dominated_subgraph.reverse(), source=mapping["loop_header"])
|
||||||
|
# a node is in the loop if there is a backwards path to the header that doesn't leave the loop
|
||||||
|
loop_nodes = [loop_node for loop_node, distance in reverse_reachability_map.items() if distance >= 0]
|
||||||
|
return mapping["tail"] not in loop_nodes
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=LoopTemplate._subgraph, root_key="loop_header", mapping_verification_func=verify_tail_not_in_loop)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
loop_template = LoopTemplate(loop_header=mapping["loop_header"], loop_body=mapping["loop_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, loop_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True) if src != mapping["loop_body"])
|
||||||
|
out_edges = [(loop_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((loop_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([loop_template.loop_header, loop_template.loop_body])
|
||||||
|
reduced_cfg.add_node(loop_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.loop_header.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.loop_body.to_indented_source(source_lines))
|
||||||
|
return "\n".join([header, body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
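The check in `verify_tail_not_in_loop` can be illustrated with the standard library alone. The sketch below is an assumption-laden re-implementation, not the project's code: it replaces `get_dominator_function` and the networkx calls with a simple iterative dominator computation and a backwards breadth-first search, and the names `dominators` and `loop_nodes` are hypothetical.

```python
from collections import deque


def dominators(succ: dict, entry):
    """Iterative dominator sets: dom[n] = {n} | intersection of dom[p] over predecessors p."""
    nodes = set(succ) | {d for dsts in succ.values() for d in dsts}
    preds = {n: set() for n in nodes}
    for src, dsts in succ.items():
        for dst in dsts:
            preds[dst].add(src)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def loop_nodes(succ: dict, entry, header) -> set:
    """Nodes that reach the header backwards without leaving the header-dominated region."""
    dom, preds = dominators(succ, entry)
    dominated = {n for n in dom if header in dom[n]}
    seen = {header}
    queue = deque([header])
    while queue:
        node = queue.popleft()
        for pred in preds[node]:
            if pred in dominated and pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return seen


cfg = {"entry": ["header"], "header": ["body", "tail"], "body": ["header"], "tail": ["exit"], "exit": []}
print(loop_nodes(cfg, "entry", "header"))
```

In this toy graph the tail is dominated by the header but has no backwards path to it, so it is not a loop node and the match would be accepted.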
+57
@@ -0,0 +1,57 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_out_edge_dict
|
||||||
|
|
||||||
|
from ..placeholders.WhileTruePlaceholderTemplate import WhileTruePlaceholderTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class PreRefinedLoopTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
Matches a loop header for an unrefined loop containing breaks and continues.
|
||||||
|
Results in a RefinedLoopTemplate header and replaces all breaks and continues with LoopExitTemplates.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, loop_header: ControlFlowTemplate, loop_else: ControlFlowTemplate):
|
||||||
|
self.loop_header = loop_header
|
||||||
|
self.loop_else = loop_else
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
raise NotImplementedError("PreRefinedLoopTemplate does not have local matching logic. These are created in refine_loop")
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
return self.loop_header.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def structure_nodes_inplace(cfg: nx.DiGraph, loop_header, canonical_loop_exit, loop_successor):
|
||||||
|
if not canonical_loop_exit:
|
||||||
|
# while true; use a placeholder that makes the while true "look like" a normal loop
|
||||||
|
loop_header = WhileTruePlaceholderTemplate.structure_node_inplace(cfg, loop_header, loop_successor)
|
||||||
|
loop_template = PreRefinedLoopTemplate(loop_header=loop_header, loop_else=None)
|
||||||
|
if canonical_loop_exit != loop_successor:
|
||||||
|
loop_template = PreRefinedLoopTemplate(loop_header=loop_header, loop_else=canonical_loop_exit)
|
||||||
|
else:
|
||||||
|
loop_template = PreRefinedLoopTemplate(loop_header=loop_header, loop_else=None)
|
||||||
|
|
||||||
|
in_edges = ((src, loop_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=loop_header, data=True))
|
||||||
|
out_edges = [(loop_template, loop_successor if dst == canonical_loop_exit else dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=loop_header, data=True)]
|
||||||
|
|
||||||
|
loop_header_out_dict = get_out_edge_dict(cfg, loop_header)
|
||||||
|
exception_target, edge_type = loop_header_out_dict["exception"]
|
||||||
|
if exception_target:
|
||||||
|
out_edges.append((loop_template, exception_target, edge_type))
|
||||||
|
|
||||||
|
cfg.remove_node(loop_template.loop_header)
|
||||||
|
if loop_template.loop_else:
|
||||||
|
cfg.remove_node(loop_template.loop_else)
|
||||||
|
cfg.add_node(loop_template)
|
||||||
|
cfg.add_edges_from(in_edges)
|
||||||
|
cfg.add_edges_from(out_edges)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+147
@@ -0,0 +1,147 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from .PreRefinedLoopTemplate import PreRefinedLoopTemplate
|
||||||
|
from ..placeholders.WhileTruePlaceholderTemplate import WhileTruePlaceholderTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class RefinedLoopTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
The second stage of matching loops with breaks and continues; matches fully-structured PreRefinedLoopTemplates.
|
||||||
|
(0) = PreRefinedLoopTemplate
|
||||||
|
// \\j --> (01)
|
||||||
|
(1) (2) |
|
||||||
|
(2)
|
||||||
|
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"pre_refined_loop": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="pre_refined_loop",
|
||||||
|
dest="loop_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="pre_refined_loop", dest="loop_successor", edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="pre_refined_loop",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"loop_successor": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_successor",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="loop_successor",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_successor",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, loop_header: ControlFlowTemplate, loop_body: ControlFlowTemplate, loop_else: ControlFlowTemplate, has_successor: bool = True):
|
||||||
|
self.loop_header = loop_header
|
||||||
|
self.loop_body = loop_body
|
||||||
|
self.loop_else = loop_else
|
||||||
|
self.has_successor = has_successor
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# this pattern is only for matching on PreRefinedLoops
|
||||||
|
if not isinstance(node, PreRefinedLoopTemplate):
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=RefinedLoopTemplate._subgraph, root_key="pre_refined_loop", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
loop_template = RefinedLoopTemplate(loop_header=mapping["pre_refined_loop"].loop_header, loop_body=mapping["loop_body"], loop_else=mapping["pre_refined_loop"].loop_else, has_successor=bool(mapping["loop_successor"]))
|
||||||
|
|
||||||
|
in_edges = ((src, loop_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["loop_successor"]:
|
||||||
|
out_edges.append((loop_template, mapping["loop_successor"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((loop_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([mapping["pre_refined_loop"], loop_template.loop_body])
|
||||||
|
reduced_cfg.add_node(loop_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
loop_lines = []
|
||||||
|
header = self.loop_header.to_indented_source(source_lines)
|
||||||
|
if not self.has_successor and not isinstance(self.loop_header, WhileTruePlaceholderTemplate):
|
||||||
|
header = ControlFlowTemplate._indent_multiline_string(header)
|
||||||
|
loop_lines.append("while True: # inserted")
|
||||||
|
loop_body = ControlFlowTemplate._indent_multiline_string(self.loop_body.to_indented_source(source_lines))
|
||||||
|
loop_lines.extend([header, loop_body])
|
||||||
|
if self.loop_else:
|
||||||
|
loop_else = ControlFlowTemplate._indent_multiline_string(self.loop_else.to_indented_source(source_lines))
|
||||||
|
loop_lines.extend(["else: # inserted", loop_else])
|
||||||
|
|
||||||
|
return "\n".join(loop_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
@@ -0,0 +1,85 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge
|
||||||
|
|
||||||
|
|
||||||
|
class SelfLoopTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
An infinite loop with no extra control flow.
|
||||||
|
(0)-< --> (0)
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"loop_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="loop_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, loop_body: ControlFlowTemplate):
|
||||||
|
self.loop_body = loop_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=SelfLoopTemplate._subgraph, root_key="loop_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
loop_template = SelfLoopTemplate(loop_body=mapping["loop_body"])
|
||||||
|
|
||||||
|
reduced_cfg: nx.DiGraph = nx.relabel_nodes(cfg, {mapping["loop_body"]: loop_template})
|
||||||
|
reduced_cfg.remove_edge(loop_template, loop_template)
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.loop_body.to_indented_source(source_lines))
|
||||||
|
return f"while True: # inserted\n{body}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
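The reduction performed by `SelfLoopTemplate.try_to_match_node` (relabel the node, then drop its self edge) can be sketched on a plain set of edges. This is a simplified illustration under stated assumptions: `collapse_self_loop` is a hypothetical name, edges are `(source, dest, type)` triples instead of an `nx.DiGraph`, and the `"natural"` and `"exception"` type strings are assumed values, since `ControlFlowEdgeType` is not shown here.

```python
def collapse_self_loop(edges: set, node, template):
    """Return a reduced edge set with `node` replaced by `template`, or None if there is no self loop."""
    if (node, node, "natural") not in edges:
        return None

    def relabel(n):
        return template if n == node else n

    # Relabel every endpoint, then remove the self edge that the template now absorbs.
    reduced = {(relabel(src), relabel(dst), kind) for src, dst, kind in edges}
    reduced.discard((template, template, "natural"))
    return reduced


edges = {
    ("entry", "spin", "natural"),
    ("spin", "spin", "natural"),
    ("spin", "handler", "exception"),
}
print(collapse_self_loop(edges, "spin", "SelfLoop(spin)"))
```

Incoming edges and the optional exception edge survive the relabeling, which is why the original only needs to remove the single self edge afterwards.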
+135
@@ -0,0 +1,135 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..if_then.IfElseTemplate import IfElseTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
class WhileTrueIfElseTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A while true that contains an if-else statement at the top level.
|
||||||
|
(0)
|
||||||
|
j| \\
|
||||||
|
(2) (1) --> (012)
|
||||||
|
|
||||||
|
nodes 1 and 2 have a backwards unconditional jump to 0
|
||||||
|
optionally, all nodes in the pattern can have a shared exception handler.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"loop_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="if_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="else_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="loop_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"if_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="loop_header",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="if_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="loop_header",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, loop_header: ControlFlowTemplate, if_body: ControlFlowTemplate, else_body: ControlFlowTemplate):
|
||||||
|
self.loop_header = loop_header
|
||||||
|
self.if_body = if_body
|
||||||
|
self.else_body = else_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=WhileTrueIfElseTemplate._subgraph, root_key="loop_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
loop_template = WhileTrueIfElseTemplate(
|
||||||
|
loop_header=mapping["loop_header"],
|
||||||
|
if_body=mapping["if_body"],
|
||||||
|
else_body=mapping["else_body"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, loop_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True) if src != mapping["if_body"] and src != mapping["else_body"])
|
||||||
|
out_edges = []
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((loop_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([loop_template.loop_header, loop_template.if_body, loop_template.else_body])
|
||||||
|
reduced_cfg.add_node(loop_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
|
||||||
|
if_else_template = IfElseTemplate(if_header=self.loop_header, if_body=self.if_body, else_body=self.else_body)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(if_else_template.to_indented_source(source_lines))
|
||||||
|
return "\n".join(["while True: # inserted", body])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
@@ -0,0 +1,232 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
import collections
|
||||||
|
|
||||||
|
from ..cfg_utils import ControlFlowEdgeType, get_dominator_function
|
||||||
|
from .natural.InstructionTemplate import InstructionTemplate
|
||||||
|
|
||||||
|
from .abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from .natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
from typing import Callable, Any
|
||||||
|
|
||||||
|
# common node/edge/mapping verification functions and factories
|
||||||
|
|
||||||
|
|
||||||
|
def assert_edge_type(*edge_types: ControlFlowEdgeType) -> Callable[[Any, Any, dict], bool]:
|
||||||
|
def initialized_assert_edge_type(graph_source, graph_dest, graph_edge_properties: dict) -> bool:
|
||||||
|
if graph_edge_properties is None:
|
||||||
|
return False
|
||||||
|
return graph_edge_properties.get("type", None) in [edge_type.value for edge_type in edge_types]
|
||||||
|
|
||||||
|
return initialized_assert_edge_type
|
||||||
|
|
||||||
|
|
||||||
|
def assert_node_type(*node_types: type) -> Callable[[nx.DiGraph, Any], bool]:
|
||||||
|
def initialized_assert_node_type(cfg: nx.DiGraph, node) -> bool:
|
||||||
|
return any(isinstance(node, node_type) for node_type in node_types)
|
||||||
|
|
||||||
|
return initialized_assert_node_type
|
||||||
|
|
||||||
|
|
||||||
|
def assert_in_degree(in_degree: int) -> Callable[[nx.DiGraph, Any], bool]:
|
||||||
|
def initialized_assert_in_degree(cfg: nx.DiGraph, node) -> bool:
|
||||||
|
if node is None:
|
||||||
|
return False
|
||||||
|
return cfg.in_degree(node) == in_degree
|
||||||
|
|
||||||
|
return initialized_assert_in_degree
|
||||||
|
|
||||||
|
|
||||||
|
def assert_instruction_opname(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
|
||||||
|
def initialized_assert_instruction_opname(cfg: nx.DiGraph, node) -> bool:
|
||||||
|
if node is None:
|
||||||
|
return False
|
||||||
|
|
||||||
|
if isinstance(node, LinearSequenceTemplate):
|
||||||
|
candidate = node.members[-1]
|
||||||
|
else:
|
||||||
|
candidate = node
|
||||||
|
|
||||||
|
if not isinstance(candidate, InstructionTemplate):
|
||||||
|
return False
|
||||||
|
return candidate.instruction.opname in opnames
|
||||||
|
|
||||||
|
return initialized_assert_instruction_opname
|
||||||
|
|
||||||
|
|
||||||
|
def assert_first_instruction_opname(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
|
||||||
|
def initialized_assert_first_instruction_opname(cfg: nx.DiGraph, node) -> bool:
|
||||||
|
"""
|
||||||
|
if node is None:
|
||||||
|
return False
|
||||||
|
|
||||||
|
if isinstance(node, LinearSequenceTemplate):
|
||||||
|
candidate = node.members[0]
|
||||||
|
else:
|
||||||
|
candidate = node
|
||||||
|
|
||||||
|
if not isinstance(candidate, InstructionTemplate):
|
||||||
|
return False
|
||||||
|
return candidate.instruction.opname in opnames
|
||||||
|
"""
|
||||||
|
i = node.get_instructions()
|
||||||
|
return bool(i) and i[0].opname in opnames
|
||||||
|
|
||||||
|
return initialized_assert_first_instruction_opname
|
||||||
|
|
||||||
|
|
||||||
|
def assert_unconditional_jump(cfg: nx.DiGraph, node) -> bool:
    if not isinstance(node, InstructionTemplate):
        return False
    return node.instruction.is_uncond_jump


def optional_node(cfg, node) -> bool:
    # returns true even when node is None!
    # overrides default behavior of checking if the node exists
    return True


def optional_edge(graph_source, graph_dest, graph_edge_properties: dict) -> bool:
    # returns true even when the edge is None!
    # overrides default behavior of checking if the edge exists
    return True


def edge_is_none_or_matches(verification_func: Callable[[Any, Any, dict], bool]) -> Callable[[Any, Any, dict], bool]:
    def initialized_edge_is_none_or_matches(graph_source, graph_dest, graph_edge_properties: dict) -> bool:
        return graph_dest is None or verification_func(graph_source, graph_dest, graph_edge_properties)

    return initialized_edge_is_none_or_matches


def node_is_none_or_matches(verification_func: Callable[[nx.DiGraph, Any], bool]) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_node_is_none_or_matches(cfg: nx.DiGraph, node) -> bool:
        return node is None or verification_func(cfg, node)

    return initialized_node_is_none_or_matches


def node_match_all(*verification_funcs: Callable[[nx.DiGraph, Any], bool]) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_match_all(cfg: nx.DiGraph, node) -> bool:
        return all(f(cfg, node) for f in verification_funcs)

    return initialized_match_all


def node_match_none(*verification_funcs: Callable[[nx.DiGraph, Any], bool]) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_match_none(cfg: nx.DiGraph, node) -> bool:
        return not any(f(cfg, node) for f in verification_funcs)

    return initialized_match_none


def node_match_any(*verification_funcs: Callable[[nx.DiGraph, Any], bool]) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_match_any(cfg: nx.DiGraph, node) -> bool:
        return any(f(cfg, node) for f in verification_funcs)

    return initialized_match_any
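# The combinators above compose small (cfg, node) predicates into larger ones.
# A minimal sketch of how they behave, using toy predicates over integers in
# place of CFG nodes. The names match_all, match_any, match_none, none_or,
# is_even and is_positive are hypothetical and exist only in this example.

```python
from typing import Any, Callable

Predicate = Callable[[Any, Any], bool]


def match_all(*funcs: Predicate) -> Predicate:
    return lambda cfg, node: all(f(cfg, node) for f in funcs)


def match_any(*funcs: Predicate) -> Predicate:
    return lambda cfg, node: any(f(cfg, node) for f in funcs)


def match_none(*funcs: Predicate) -> Predicate:
    return lambda cfg, node: not any(f(cfg, node) for f in funcs)


def none_or(func: Predicate) -> Predicate:
    # a missing node (None) passes without the wrapped predicate being called
    return lambda cfg, node: node is None or func(cfg, node)


def is_even(cfg: Any, node: int) -> bool:
    return node % 2 == 0


def is_positive(cfg: Any, node: int) -> bool:
    return node > 0


even_and_positive = match_all(is_even, is_positive)
print(even_and_positive(None, 4))                  # True
print(even_and_positive(None, -4))                 # False
print(match_any(is_even, is_positive)(None, 3))    # True
print(match_none(is_even, is_positive)(None, -3))  # True
print(none_or(is_even)(None, None))                # True
```

# With no predicates at all, match_all() accepts every node and match_any()
# rejects every node, because all([]) is True and any([]) is False.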
|
||||||
|
|
||||||
|
|
||||||
|
def assert_no_linestarts(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
    return not any(inst.starts_line for inst in node.get_instructions())


def contains_opname_sequence(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_contains_opname_sequence(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
        for window in sliding_window(node.get_instructions(), n=len(opnames)):
            if tuple(inst.opname for inst in window) == opnames:
                return True
        return False

    return initialized_contains_opname_sequence


def starts_with_opname_sequence(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_starts_with_opname_sequence(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
        i = node.get_instructions()
        return len(i) >= len(opnames) and tuple(x.opname for x in i[: len(opnames)]) == opnames

    return initialized_starts_with_opname_sequence


def ends_with_opname_sequence(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_ends_with_opname_sequence(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
        i = node.get_instructions()
        return len(i) >= len(opnames) and tuple(x.opname for x in i[-len(opnames) :]) == opnames

    return initialized_ends_with_opname_sequence


def is_exactly_opname(*opnames: str) -> Callable[[nx.DiGraph, Any], bool]:
    def initialized_is_exactly_opname(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
        return isinstance(node, ControlFlowTemplate) and tuple(x.opname for x in node.get_instructions()) == opnames

    return initialized_is_exactly_opname
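# The four sequence predicates above differ only in where the opname pattern
# must appear: anywhere (contiguously), at the start, at the end, or as the
# whole instruction list. A minimal sketch over plain lists of opname strings.
# It uses index slicing instead of sliding_window, and the helper names
# contains, starts_with and ends_with are hypothetical.

```python
from typing import Sequence, Tuple


def contains(opnames: Sequence[str], pattern: Tuple[str, ...]) -> bool:
    # the pattern must match a contiguous run of opnames
    k = len(pattern)
    return any(tuple(opnames[i : i + k]) == pattern for i in range(len(opnames) - k + 1))


def starts_with(opnames: Sequence[str], pattern: Tuple[str, ...]) -> bool:
    return len(opnames) >= len(pattern) and tuple(opnames[: len(pattern)]) == pattern


def ends_with(opnames: Sequence[str], pattern: Tuple[str, ...]) -> bool:
    return len(opnames) >= len(pattern) and tuple(opnames[-len(pattern) :]) == pattern


ops = ["LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"]

print(contains(ops, ("STORE_FAST", "DELETE_FAST")))   # True
print(contains(ops, ("STORE_FAST", "RERAISE")))       # False: not contiguous
print(starts_with(ops, ("LOAD_CONST", "STORE_FAST")))  # True
print(ends_with(ops, ("DELETE_FAST", "RERAISE")))      # True
print(tuple(ops) == ("LOAD_CONST", "STORE_FAST"))      # False: not an exact match
```

# Only non-empty patterns are shown here; an empty pattern makes the
# negative-index slice in ends_with select the whole list.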
|
||||||
|
|
||||||
|
|
||||||
|
def assert_node_has_no_backwards_edges(cfg, node) -> bool:
    # a back edge is an outgoing edge whose target dominates its source
    dominates = get_dominator_function(cfg)
    return not any(dominates(successor, node) for successor in cfg.successors(node))
|
||||||
|
|
||||||
|
|
||||||
|
def assert_except_as(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
    # specialized node verification function for the header
    # the header must be a LinearSequence that ends with an exception-match check; which instruction performs it depends on the version
    # 3.9-3.10: the last instruction is JUMP_IF_NOT_EXC_MATCH
    # this instruction is used *exclusively* for except-as constructions in pre-3.11 bytecode
    # pre-3.9: the second-to-last instruction is COMPARE_OP with argval "exception-match"
    # 3.11: the second-to-last instruction is CHECK_EXC_MATCH
    if not isinstance(node, LinearSequenceTemplate):
        return False

    # version 3.9-3.10
    exc_match_member = node.members[-1]
    if not isinstance(exc_match_member, InstructionTemplate):
        return False
    if exc_match_member.instruction.opname == "JUMP_IF_NOT_EXC_MATCH":
        return True

    # guard so that members[-2] below cannot raise an IndexError
    if len(node.members) < 2:
        return False

    # pre-3.9
    exc_match_member = node.members[-2]
    if not isinstance(exc_match_member, InstructionTemplate):
        return False
    if exc_match_member.instruction.opname == "COMPARE_OP" and exc_match_member.instruction.argval == "exception-match":
        return True

    # 3.11
    if exc_match_member.instruction.opname == "CHECK_EXC_MATCH":
        return True
    return False
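# The opcode that performs the exception-match check differs between CPython
# versions, which is why the function above tests three different names. A
# minimal sketch that disassembles a small except-as function with the standard
# dis module and reports which of those names the running interpreter emits.
# The function sample and the set EXC_MATCH_OPNAMES are hypothetical names for
# this illustration, and the printed result depends on the Python version.

```python
import dis


def sample():
    try:
        int("x")
    except ValueError as err:
        return str(err)


# the three opnames that assert_except_as looks for
EXC_MATCH_OPNAMES = {"JUMP_IF_NOT_EXC_MATCH", "CHECK_EXC_MATCH", "COMPARE_OP"}

# dis.get_instructions yields one Instruction per bytecode instruction
opnames = [inst.opname for inst in dis.get_instructions(sample)]
found = sorted(EXC_MATCH_OPNAMES.intersection(opnames))
print(found)
```

# sample contains no ordinary comparison, so a COMPARE_OP in its bytecode can
# only come from the exception-match check.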
|
||||||
|
|
||||||
|
|
||||||
|
def assert_with(cfg: nx.DiGraph, node: ControlFlowTemplate) -> bool:
    # these statements begin in a linear sequence template
    # so if the node is not in a linear sequence template then this is not
    # a with statement

    if not isinstance(node, LinearSequenceTemplate):
        return False

    # designed for version 3.8; might be different for other versions
    with_match_member = node.members[-1]  # get the last element that should be a SETUP_WITH
    if not isinstance(with_match_member, InstructionTemplate):
        return False
    # return an explicit bool instead of falling through to an implicit None
    return with_match_member.instruction.opname in ("SETUP_WITH", "SETUP_ASYNC_WITH", "BEFORE_WITH", "END_SEND")


# iteration helper
def sliding_window(iterable, n):
    "Collect data into overlapping fixed-length chunks or blocks."
    # sliding_window('ABCDEFG', 4) → ABCD BCDE CDEF DEFG
    it = iter(iterable)
    window = collections.deque(itertools.islice(it, n - 1), maxlen=n)
    for x in it:
        window.append(x)
        yield tuple(window)
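# A short usage sketch for the helper above. The function body is repeated so
# the example runs on its own; the inputs are made up for illustration.

```python
import collections
import itertools


def sliding_window(iterable, n):
    "Collect data into overlapping fixed-length chunks or blocks."
    it = iter(iterable)
    # pre-fill with n - 1 items; each further item completes one window
    window = collections.deque(itertools.islice(it, n - 1), maxlen=n)
    for x in it:
        window.append(x)
        yield tuple(window)


windows = ["".join(w) for w in sliding_window("ABCDEFG", 4)]
print(windows)  # ['ABCD', 'BCDE', 'CDEF', 'DEFG']

# an input shorter than n produces no windows at all
too_short = list(sliding_window("AB", 4))
print(too_short)  # []

# n = 1 yields every item as a 1-tuple
singles = list(sliding_window([1, 2, 3], 1))
print(singles)  # [(1,), (2,), (3,)]
```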
|
||||||
+69
@@ -0,0 +1,69 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from pylingual.editable_bytecode import Inst
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class InstructionTemplate(ControlFlowTemplate):
    """
    A thin wrapper around the Inst class to support formatting source code
    """

    def __init__(self, instruction: Inst):
        self.instruction = instruction

    @staticmethod
    def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
        """
        Attempts to match this template on the graph at the given node.
        If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
        Otherwise, returns None.
        """
        if not isinstance(node, Inst):
            return None

        if node not in cfg.nodes:
            return None

        inst_template = InstructionTemplate(node)
        return nx.relabel_nodes(cfg, mapping={node: inst_template}, copy=True)

    @staticmethod
    def match_graph(cfg: nx.DiGraph) -> nx.DiGraph:
        """
        Attempts to match this template on the whole graph
        Returns an updated cfg with the appropriate nodes condensed into an instance of this template.
        """
        node_mapping = dict()
        for node in cfg.nodes:
            if not isinstance(node, Inst):
                continue

            inst_template = InstructionTemplate(node)
            node_mapping[node] = inst_template

        return nx.relabel_nodes(cfg, mapping=node_mapping, copy=True)

    def to_indented_source(self, source_lines: list[str]) -> str:
        """
        Returns the source code for this template, recursively calling into its children to create the full source code.
        """
        if not self.instruction.starts_line:
            return ""

        line = source_lines[self.instruction.starts_line - 1].strip()
        if line.startswith("elif "):
            # drop the leading "el" so the line reads as a plain "if"
            line = line[2:]
        elif line in ("break", "continue", "except:", "try:"):
            line = ""

        return line

    def get_instructions(self) -> list[Inst]:
        return [self.instruction]

    def __repr__(self) -> str:
        if self.instruction.starts_line:
            return f"({self.instruction.starts_line}) <{self.instruction.get_dis_view()}>"
        return f"<{self.instruction.get_dis_view()}>"
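# The line handling in to_indented_source above can be read in isolation: a
# line is looked up by its 1-based number, stripped, and then either rewritten
# ("elif" becomes "if") or blanked (bare control-flow keywords). A minimal
# sketch as a free function; normalize_line and the sample source are
# hypothetical names for this illustration.

```python
from typing import List, Optional


def normalize_line(source_lines: List[str], starts_line: Optional[int]) -> str:
    # instructions that do not start a source line contribute no text
    if not starts_line:
        return ""

    # starts_line is 1-based, the list is 0-based
    line = source_lines[starts_line - 1].strip()
    if line.startswith("elif "):
        line = line[2:]
    elif line in ("break", "continue", "except:", "try:"):
        line = ""
    return line


source = ["if a:", "    x = 1", "elif b:", "    break", "try:"]
normalized = [normalize_line(source, n) for n in (1, 2, 3, 4, 5, None)]
print(normalized)  # ['if a:', 'x = 1', 'if b:', '', '', '']
```

# Indentation is discarded here on purpose; in the templates above it is the
# enclosing template that re-indents the text of its children.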
|
||||||
@@ -0,0 +1,37 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from pylingual.editable_bytecode import Inst
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class LineTemplate(ControlFlowTemplate):
    """
    A natural progression of control flow templates with the same exception handler.
    No conditional jumps are allowed.
    """

    def __init__(self, *members: ControlFlowTemplate):
        self.members = members

    @staticmethod
    def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
        raise NotImplementedError("LineTemplates do not have local matching logic.")

    def to_indented_source(self, source_lines: list[str]) -> str:
        """
        Returns the source code for this template, recursively calling into its children to create the full source code.
        """
        return "\n".join(member.to_indented_source(source_lines) for member in self.members)

    def get_instructions(self) -> list[Inst]:
        # instructions are returned in member order
        insts: list[Inst] = []
        for member in self.members:
            insts.extend(member.get_instructions())
        return insts

    def __repr__(self) -> str:
        name = f"{type(self).__name__}"
        components = ControlFlowTemplate._indent_multiline_string("\n".join(repr(member) for member in self.members))
        return f"{name}[\n{components}]"
|
||||||
+127
@@ -0,0 +1,127 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from pylingual.editable_bytecode import Inst
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
# imports for our exception whitelist so we do not have to absorb any tails and affect
|
||||||
|
# control flow in the future (hopefully fingers crossed)
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..loop.PreRefinedLoopTemplate import PreRefinedLoopTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import get_out_edge_dict, ControlFlowEdgeType, get_dominator_function
|
||||||
|
|
||||||
|
|
||||||
|
class LinearSequenceTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
A natural progression of control flow templates with the same exception handler.
|
||||||
|
No conditional jumps are allowed.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, *members: ControlFlowTemplate):
|
||||||
|
self.members = members
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes or not isinstance(node, ControlFlowTemplate):
|
||||||
|
return None
|
||||||
|
|
||||||
|
dominates = get_dominator_function(cfg)
|
||||||
|
|
||||||
|
def is_back_edge(src, dst):
|
||||||
|
return dominates(dst, src)
|
||||||
|
|
||||||
|
base_edge_dict = get_out_edge_dict(cfg, node)
|
||||||
|
base_exception_handler = base_edge_dict["exception"]
|
||||||
|
|
||||||
|
# validate that the current node is able to start a linear sequence
|
||||||
|
# jumps cannot start a linear sequence
|
||||||
|
if base_edge_dict["conditional"][0] or (base_edge_dict["natural"][1] and base_edge_dict["natural"][1]["type"] != ControlFlowEdgeType.NATURAL.value):
|
||||||
|
return None
|
||||||
|
# back edges cannot start a linear sequence
|
||||||
|
if any(is_back_edge(*edge) for edge in cfg.out_edges(node)):
|
||||||
|
return None
|
||||||
|
|
||||||
|
matched_sequence = [node]
|
||||||
|
current_edge_dict = base_edge_dict
|
||||||
|
|
||||||
|
# while there is a natural progression, try to extend the linear sequence
|
||||||
|
while (next_node_and_edge_properties := current_edge_dict["natural"])[0]:
|
||||||
|
next_node, _ = next_node_and_edge_properties
|
||||||
|
next_edge_dict = get_out_edge_dict(cfg, next_node)
|
||||||
|
|
||||||
|
# all elements of a linear sequence must have the same exception handler
|
||||||
|
if next_edge_dict["exception"] != base_exception_handler:
|
||||||
|
break
|
||||||
|
|
||||||
|
# only the natural incoming edge from the previous node is allowed in linear sequences
|
||||||
|
if cfg.in_degree(nbunch=next_node) > 1:
|
||||||
|
break
|
||||||
|
|
||||||
|
# do not extend after an END_FINALLY
|
||||||
|
if isinstance(matched_sequence[-1], ControlFlowTemplate) and not isinstance(matched_sequence[-1], AbstractNonSequentiable):
|
||||||
|
insts = matched_sequence[-1].get_instructions()
|
||||||
|
if insts and insts[-1].opname == "END_FINALLY":
|
||||||
|
break
|
||||||
|
|
||||||
|
# do not merge in prerefined loop templates; they still need to be refined
|
||||||
|
if isinstance(next_node, PreRefinedLoopTemplate):
|
||||||
|
break
|
||||||
|
|
||||||
|
# conditional jumps are only allowed in the last element of a linear sequence
|
||||||
|
if current_edge_dict["conditional"][0] and current_edge_dict["natural"][0]:
|
||||||
|
break
|
||||||
|
|
||||||
|
# absolute jumps are only allowed in the last element of a linear sequence
|
||||||
|
if current_edge_dict["natural"][1] and current_edge_dict["natural"][1]["type"] != ControlFlowEdgeType.NATURAL.value:
|
||||||
|
break
|
||||||
|
|
||||||
|
matched_sequence.append(next_node)
|
||||||
|
current_edge_dict = next_edge_dict
|
||||||
|
|
||||||
|
# if we didn't reduce the graph size, match failed
|
||||||
|
if len(matched_sequence) < 2:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# unpack nested LinearSequenceTemplates for improved readability of the parse tree
|
||||||
|
unpacked_matched_sequence = []
|
||||||
|
for match_item in matched_sequence:
|
||||||
|
if isinstance(match_item, LinearSequenceTemplate):
|
||||||
|
unpacked_matched_sequence.extend(match_item.members)
|
||||||
|
else:
|
||||||
|
unpacked_matched_sequence.append(match_item)
|
||||||
|
# preserve the incoming edges from the first node and the outgoing edges from the last node
|
||||||
|
linear_sequence_template = LinearSequenceTemplate(*unpacked_matched_sequence)
|
||||||
|
in_edges = ((linear_sequence_template if src in matched_sequence else src, linear_sequence_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = ((linear_sequence_template, linear_sequence_template if dst in matched_sequence else dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=matched_sequence[-1], data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(matched_sequence)
|
||||||
|
reduced_cfg.add_node(linear_sequence_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
return "\n".join(member.to_indented_source(source_lines) for member in self.members)
|
||||||
|
|
||||||
|
    def get_instructions(self) -> list[Inst]:
        # instructions are returned in member order
        insts: list[Inst] = []
        for member in self.members:
            insts.extend(member.get_instructions())
        return insts
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
name = f"{type(self).__name__}"
|
||||||
|
components = ControlFlowTemplate._indent_multiline_string("\n".join(repr(member) for member in self.members))
|
||||||
|
return f"{name}[\n{components}]"
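# The core of try_to_match_node above is a chain walk: starting from one node,
# keep following the fall-through edge while the next node has no other
# incoming edge, and report a match only if at least two nodes were collected.
# A minimal sketch of that walk on plain dicts. It leaves out the exception
# handler, jump, END_FINALLY and loop checks; extend_chain and the dict layout
# are hypothetical.

```python
from typing import Dict, Hashable, List, Optional


def extend_chain(
    natural: Dict[Hashable, Optional[Hashable]],
    in_degree: Dict[Hashable, int],
    start: Hashable,
) -> Optional[List[Hashable]]:
    # natural maps a node to its single fall-through successor, or None
    chain = [start]
    current = start
    while (nxt := natural.get(current)) is not None:
        # a node reachable from elsewhere cannot be absorbed into the chain
        if in_degree[nxt] > 1:
            break
        # guard against walking around a cycle forever
        if nxt in chain:
            break
        chain.append(nxt)
        current = nxt

    # a chain of one node does not shrink the graph, so it is not a match
    return chain if len(chain) >= 2 else None


natural_edges = {"a": "b", "b": "c", "c": "d", "d": None}
in_degrees = {"a": 0, "b": 1, "c": 1, "d": 2}

print(extend_chain(natural_edges, in_degrees, "a"))  # ['a', 'b', 'c']
print(extend_chain(natural_edges, in_degrees, "c"))  # None
```

# Node "d" has two incoming edges, so the walk from "a" stops before it, and
# the walk from "c" collects nothing beyond its starting node.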
|
||||||
+30
@@ -0,0 +1,30 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptPlaceholderTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
Placeholder for except; used in ExceptAs.py
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, body: ControlFlowTemplate):
|
||||||
|
self.body = body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
raise NotImplementedError("ExceptPlaceholderTemplate does not have local matching logic. These are created in ExceptAs")
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.body.to_indented_source(source_lines))
|
||||||
|
return f"except:\n{body}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__() if self.body else type(self).__name__
|
||||||
+9
@@ -0,0 +1,9 @@
|
|||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class IrreduciblePlaceholderTemplate(ControlFlowTemplate):
|
||||||
|
def __init__(self, msg):
|
||||||
|
self.msg = msg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
return f"pass # cflow: {self.msg}"
|
||||||
+42
@@ -0,0 +1,42 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
|
||||||
|
class WhileTruePlaceholderTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
Placeholder for While True; used in PreRefinedLoopTemplate
|
||||||
|
"""
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
raise NotImplementedError("WhileTruePlaceholderTemplate does not have local matching logic. These are created in PreRefinedLoopTemplate")
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
return "while True: # inserted"
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def structure_node_inplace(cfg: nx.DiGraph, loop_header, loop_successor):
|
||||||
|
# insert a WhileTruePlaceholderTemplate before the loop_header, and add a conditional edge to the loop successor
|
||||||
|
# this "looks like" a normal while loop, which allows structuring to continue
|
||||||
|
placeholder = WhileTruePlaceholderTemplate()
|
||||||
|
|
||||||
|
# replace the incoming edges
|
||||||
|
in_edges = [(src, placeholder, data) for src, _, data in cfg.in_edges(loop_header, data=True)]
|
||||||
|
cfg.add_edges_from(in_edges)
|
||||||
|
cfg.remove_edges_from(list(cfg.in_edges(loop_header)))
|
||||||
|
|
||||||
|
# add outgoing edges to the placeholder
|
||||||
|
cfg.add_edge(placeholder, loop_header, type=ControlFlowEdgeType.NATURAL.value)
|
||||||
|
if loop_successor:
|
||||||
|
cfg.add_edge(placeholder, loop_successor, type=ControlFlowEdgeType.FALSE_JUMP.value)
|
||||||
|
|
||||||
|
return placeholder
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return type(self).__name__
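# structure_node_inplace above rewires the graph so that every edge that used
# to enter the loop header enters the placeholder instead, and the placeholder
# then falls through to the header and optionally branches to the loop
# successor. A minimal sketch of that rewiring on a dict of edges. Unlike the
# method above it returns a new dict rather than mutating the graph, and the
# edge-type strings and the name insert_placeholder are hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple

Edges = Dict[Tuple[Hashable, Hashable], str]


def insert_placeholder(
    edges: Edges,
    header: Hashable,
    successor: Optional[Hashable],
    placeholder: Hashable = "while_true",
) -> Edges:
    rewired: Edges = {}
    for (src, dst), kind in edges.items():
        # redirect every edge that entered the header, back edges included
        if dst == header:
            rewired[(src, placeholder)] = kind
        else:
            rewired[(src, dst)] = kind

    # the placeholder falls through to the real header
    rewired[(placeholder, header)] = "natural"
    # and, if there is a loop successor, gets a conditional exit to it
    if successor is not None:
        rewired[(placeholder, successor)] = "false_jump"
    return rewired


loop_edges = {("pre", "head"): "natural", ("head", "body"): "natural", ("body", "head"): "jump"}
rewired = insert_placeholder(loop_edges, header="head", successor="after")
for edge, kind in sorted(rewired.items()):
    print(edge, kind)
```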
|
||||||
+130
@@ -0,0 +1,130 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, node_is_none_or_matches, edge_is_none_or_matches
|
||||||
|
|
||||||
|
|
||||||
|
class ExitSubTemplate(ControlFlowTemplate):
|
||||||
|
"""
|
||||||
|
|
||||||
|
    A basic with template as a catch-all for exits.

    first case:

       (1)     or simply    (1)
      e/  \
          (2)
|
||||||
|
|
||||||
|
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"exit_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="exit_header", dest="exit_flow", edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.NATURAL))),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exit_header",
|
||||||
|
dest="exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"exit_flow": TemplateNode(node_verification_func=node_is_none_or_matches(assert_in_degree(1)), exception_edge=TemplateEdge(source="exit_flow", dest="outer_exception_handler", edge_verification_func=optional_edge)),
|
||||||
|
"exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, exit_header: ControlFlowTemplate, exit_flow: ControlFlowTemplate):
|
||||||
|
self.exit_header = exit_header
|
||||||
|
self.exit_flow = exit_flow
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
        Attempts to match this template on the graph at the given node.
        If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
        If the template does not match, or it matches but there is no exit flow to condense, returns the cfg unchanged.
        Returns None only if the node is not in the cfg.
|
||||||
|
"""
|
||||||
|
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=ExitSubTemplate._subgraph,
|
||||||
|
root_key="exit_header",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
# not standard; if we didn't match the exit, then continue matching the rest of the parent template
|
||||||
|
if not mapping:
|
||||||
|
return cfg
|
||||||
|
|
||||||
|
# this is an appropriate match, but there is nothing to do
|
||||||
|
if not mapping["exit_flow"]:
|
||||||
|
return cfg
|
||||||
|
|
||||||
|
exit_template = ExitSubTemplate(
|
||||||
|
exit_header=mapping["exit_header"],
|
||||||
|
exit_flow=mapping["exit_flow"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = []
|
||||||
|
|
||||||
|
if mapping["exception_handler"]:
|
||||||
|
out_edges.append((exit_template, mapping["exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([exit_template.exit_flow, exit_template.exit_header])
|
||||||
|
reduced_cfg.add_node(exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
header = self.exit_header.to_indented_source(source_lines)
|
||||||
|
exit_flow = self.exit_flow.to_indented_source(source_lines)
|
||||||
|
return "\n".join([header, exit_flow])
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+139
@@ -0,0 +1,139 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, node_match_all, node_match_any, contains_opname_sequence, edge_is_none_or_matches
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsCleanupTemplate(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
The boilerplate cleanup at the end of an `except as` block.
|
||||||
|
The "happy cleanup" (1) is when there is no exception, and it jumps out to the next code segment.
|
||||||
|
The "angry cleanup" (2) is when there is an exception, and it reraises.
|
||||||
|
(0)
|
||||||
|
/ \\e --> (012)
|
||||||
|
(1) (2) |j
|
||||||
|
|j (3)
|
||||||
|
(3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="happy_cleanup",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="angry_cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"happy_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
node_match_any(
|
||||||
|
contains_opname_sequence(
|
||||||
|
"LOAD_CONST",
|
||||||
|
"STORE_NAME",
|
||||||
|
"DELETE_NAME",
|
||||||
|
),
|
||||||
|
contains_opname_sequence(
|
||||||
|
"LOAD_CONST",
|
||||||
|
"STORE_FAST",
|
||||||
|
"DELETE_FAST",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="happy_cleanup", dest=None, edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.JUMP))),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="happy_cleanup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"angry_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
node_match_any(
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "RERAISE"),
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="angry_cleanup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_body: ControlFlowTemplate, happy_cleanup: ControlFlowTemplate, angry_cleanup: ControlFlowTemplate):
|
||||||
|
self.except_body = except_body
|
||||||
|
self.happy_cleanup = happy_cleanup
|
||||||
|
self.angry_cleanup = angry_cleanup
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptAsCleanupTemplate._subgraph, root_key="except_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_as_cleanup_template = ExceptAsCleanupTemplate(except_body=mapping["except_body"], happy_cleanup=mapping.get("happy_cleanup", None), angry_cleanup=mapping.get("angry_cleanup", None))
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_cleanup_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
out_edges = [(except_as_cleanup_template, dst, data) for _, dst, data in cfg.out_edges([except_as_cleanup_template.happy_cleanup, except_as_cleanup_template.angry_cleanup], data=True)]
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([except_as_cleanup_template.except_body, except_as_cleanup_template.happy_cleanup, except_as_cleanup_template.angry_cleanup])
|
||||||
|
reduced_cfg.add_node(except_as_cleanup_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
# cleanup code is implicit! only report the body code
|
||||||
|
return self.except_body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
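The comment in `to_indented_source` above says the cleanup code is implicit. As background, and not as code from this repository, the sketch below shows the language behaviour behind that remark: Python unbinds the `as` name when an `except` clause finishes, so the compiled bytecode contains cleanup instructions with no counterpart in the source. The function name `demo` and the returned strings are illustrative assumptions.

```python
def demo() -> str:
    try:
        raise ValueError("boom")
    except ValueError as err:
        message = str(err)
    # On leaving the handler, the compiler-generated cleanup sets `err` to
    # None and deletes it. Those instructions have no source line of their
    # own, so a decompiler should not emit them.
    try:
        err  # deliberately touch the name that was cleaned up
    except NameError:  # UnboundLocalError is a subclass of NameError
        return f"{message}: name was cleaned up"
    return f"{message}: name survived"


result = demo()
```

This is why a template for the cleanup blocks can report only the body: the cleanup is regenerated by the compiler whenever the `except ... as` clause is recompiled.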
+154
@@ -0,0 +1,154 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as
|
||||||
|
|
||||||
|
from ..placeholders.ExceptPlaceholderTemplate import ExceptPlaceholderTemplate
|
||||||
|
from .ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsExceptTemplate(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
An `except as` block, after its cleanup has been structured.
|
||||||
|
If there are multiple, this will match the last block in the series and set up the next one to be matched
|
||||||
|
   (0)
   / \\j      -->   (012)
 (1)  (2)            |j
   \\j //j          (3)
    (3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(source="except_as_header", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="after_except",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="after_except",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_body: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptAsExceptTemplate._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
non_match_path = mapping["non_match_path"]
|
||||||
|
if not isinstance(non_match_path, (ExceptAsExceptTemplate, ExceptAsTemplate)):
|
||||||
|
non_match_path = ExceptPlaceholderTemplate(body=non_match_path)
|
||||||
|
|
||||||
|
except_as_cleanup_template = ExceptAsExceptTemplate(except_as_header=mapping["except_as_header"], except_body=mapping["except_body"], non_match_path=non_match_path)
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_cleanup_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# drop internal edges; keep only the outer exception edge and the jump to the code after the except
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping["after_except"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([except_as_cleanup_template.except_as_header, except_as_cleanup_template.except_body, mapping["non_match_path"]])
|
||||||
|
reduced_cfg.add_node(except_as_cleanup_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines).rstrip()
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines)).rstrip()
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines).rstrip()
|
||||||
|
return f"{header}\n{body}\n{non_match}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
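The `to_indented_source` method above joins a header, an indented body and the following handler. A minimal standalone sketch of that layout is shown here. `indent_block` is a hypothetical stand-in for `ControlFlowTemplate._indent_multiline_string`, whose real implementation is not shown in this diff, and the four-space prefix is an assumption.

```python
import textwrap


def indent_block(text: str, prefix: str = "    ") -> str:
    """Hypothetical stand-in for ControlFlowTemplate._indent_multiline_string."""
    # textwrap.indent prefixes every non-empty line of the text.
    return textwrap.indent(text, prefix)


def render_except(header: str, body: str, non_match: str) -> str:
    # Same shape as f"{header}\n{body}\n{non_match}": header flush left,
    # body one level deeper, the next handler flush left again.
    return f"{header.rstrip()}\n{indent_block(body).rstrip()}\n{non_match.rstrip()}"


rendered = render_except(
    "except KeyError as e:",
    "log(e)\nretry()",
    "except ValueError:\n    pass",
)
```

Stripping trailing whitespace from each part before joining keeps blank lines from piling up between consecutive handlers.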
+159
@@ -0,0 +1,159 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as, node_match_all, node_match_any, contains_opname_sequence
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsExitTemplate(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
An `except as` block, but with an exit statement.
|
||||||
|
If there are multiple, this will match the last block in the series and set up the next one to be matched
|
||||||
|
   (0)
   / \\j      -->   (01234)
 (1)  (2)
  |
 (3)
  |e
 (4)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_body_setup",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(source="except_as_header", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_body_setup": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body_setup", dest="except_body", edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body_setup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="except_as_cleanup",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_as_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
node_match_any(
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "RERAISE"),
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"),
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "END_FINALLY"),
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "END_FINALLY"),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_as_cleanup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_body_setup: ControlFlowTemplate, except_body: ControlFlowTemplate, except_as_cleanup: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_body_setup = except_body_setup
|
||||||
|
self.except_body = except_body
|
||||||
|
self.except_as_cleanup = except_as_cleanup
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptAsExitTemplate._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_as_exit_template = ExceptAsExitTemplate(
|
||||||
|
except_as_header=mapping["except_as_header"], except_body_setup=mapping["except_body_setup"], except_body=mapping["except_body"], except_as_cleanup=mapping["except_as_cleanup"], non_match_path=mapping["non_match_path"]
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_exit_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
out_edges = ((except_as_exit_template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(except_as_exit_template.non_match_path, data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(
|
||||||
|
[except_as_exit_template.except_as_header, except_as_exit_template.except_body_setup, except_as_exit_template.except_body, except_as_exit_template.except_as_cleanup, except_as_exit_template.non_match_path]
|
||||||
|
)
|
||||||
|
reduced_cfg.add_node(except_as_exit_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines) + self.except_body_setup.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
return f"{header}\n{body}\n{non_match}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
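The `except_as_cleanup` node above is recognised by the opcode names it contains. The sketch below shows one way such a check could work on a plain list of opcode names. `window_matcher` and `match_any` are hypothetical helpers written for this illustration; they are not the `contains_opname_sequence` and `node_match_any` functions from `match_utils`. Requiring the sequence to be contiguous is also an assumption.

```python
from typing import Callable, Sequence

Predicate = Callable[[Sequence[str]], bool]


def window_matcher(*wanted: str) -> Predicate:
    """Build a predicate that is true when `wanted` occurs contiguously."""
    def check(opnames: Sequence[str]) -> bool:
        n = len(wanted)
        # Slide a window of len(wanted) over the opcode names.
        return any(tuple(opnames[i:i + n]) == wanted for i in range(len(opnames) - n + 1))
    return check


def match_any(*predicates: Predicate) -> Predicate:
    """Combine predicates so that any single match is enough."""
    return lambda opnames: any(p(opnames) for p in predicates)


is_cleanup = match_any(
    window_matcher("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "RERAISE"),
    window_matcher("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"),
)

cleanup_block = ["POP_EXCEPT", "LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"]
other_block = ["LOAD_CONST", "STORE_FAST", "POP_TOP"]
```

Listing one alternative per opcode family (`*_NAME` for module scope, `*_FAST` for function locals) keeps each pattern literal and easy to compare against a disassembly.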
+141
@@ -0,0 +1,141 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsTemplate(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
An `except as` block, after its cleanup has been structured.
|
||||||
|
If there are multiple, this will match the last block in the series and set up the next one to be matched
|
||||||
|
   (0)
   / \\j      -->   (012)
 (1)  (2)            |j
  |j                (3)
 (3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(source="except_as_header", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="after_except", edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_body: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptAsTemplate._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_as_cleanup_template = ExceptAsTemplate(except_as_header=mapping["except_as_header"], except_body=mapping["except_body"], non_match_path=mapping["non_match_path"])
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_cleanup_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# drop internal edges; keep only the outer exception edge and the jump to the code after the except
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping["after_except"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([except_as_cleanup_template.except_as_header, except_as_cleanup_template.except_body, except_as_cleanup_template.non_match_path])
|
||||||
|
reduced_cfg.add_node(except_as_cleanup_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
return f"{header}\n{body}\n{non_match}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
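Each `try_to_match_node` above ends with the same reduction: copy the graph, remove the matched nodes, add the template node, then re-attach the root's in-edges and a chosen set of out-edges. The sketch below replays that step on a small hand-built graph. It uses a plain dataclass instead of `networkx` so that it runs without dependencies; `Graph`, `collapse` and the node names are assumptions made for this illustration.

```python
from dataclasses import dataclass, field

Edge = tuple[str, str, str]  # (source, destination, edge type)


@dataclass
class Graph:
    nodes: set[str] = field(default_factory=set)
    edges: set[Edge] = field(default_factory=set)


def collapse(graph: Graph, root: str, members: set[str], new_node: str, out_edges: list[Edge]) -> Graph:
    # Redirect every edge that entered the root to the collapsed node.
    in_edges = {(src, new_node, kind) for src, dst, kind in graph.edges if dst == root}
    # Removing a node also removes every edge that touches it.
    kept = {e for e in graph.edges if e[0] not in members and e[1] not in members}
    # Build a new graph so the input is left untouched, like cfg.copy().
    return Graph(nodes=(graph.nodes - members) | {new_node}, edges=kept | in_edges | set(out_edges))


cfg = Graph(
    nodes={"try_body", "header", "body", "non_match", "after"},
    edges={
        ("try_body", "header", "exception"),
        ("header", "body", "natural"),
        ("header", "non_match", "jump"),
        ("body", "after", "jump"),
        ("non_match", "after", "jump"),
    },
)
reduced = collapse(cfg, "header", {"header", "body", "non_match"}, "ExceptAs", [("ExceptAs", "after", "jump")])
```

Passing the out-edges explicitly mirrors the templates above, which decide per template which outgoing edges survive the reduction.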
+120
@@ -0,0 +1,120 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptException(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
An `except exception` block, currently confirmed to work from 3.6 to 3.8.
|
||||||
|
If there are multiple, this will match the last block in the series and set up the next one to be matched
|
||||||
|
   (0)
   / \\j      -->   (012)
 (1)  (2)            |j
|
||||||
|
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(source="except_as_header", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
# natural_edge=TemplateEdge(
#     source="except_body",
#     dest=None,  # may need a tail node; natural edges out of the body are not yet confirmed
#     edge_verification_func=optional_edge,
# ),
|
||||||
|
exception_edge=TemplateEdge(source="except_body", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_body: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptException._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_exception_template = ExceptException(except_as_header=mapping["except_as_header"], except_body=mapping["except_body"], non_match_path=mapping["non_match_path"])
|
||||||
|
|
||||||
|
in_edges = ((src, except_exception_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((except_exception_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([except_exception_template.except_as_header, except_exception_template.except_body, except_exception_template.non_match_path])
|
||||||
|
reduced_cfg.add_node(except_exception_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
return f"{header}\n{body}\n{non_match}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+167
@@ -0,0 +1,167 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, assert_instruction_opname
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class FinallyTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A basic `finally` block after the related try-excepts have been structured.
|
||||||
|
   (0)
   / \\       -->   (01234)
  |e  (2)
  |  /e|
 (3)  (4)
|
||||||
|
Does not cover the additional finally blocks that are inserted into the bytecode for returns or for breaking out of loops.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="try_except",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="angry_finally",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_except": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_except",
|
||||||
|
dest="happy_finally",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_except",
|
||||||
|
dest="angry_finally",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"happy_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="happy_finally", dest="tail", edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="happy_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"angry_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(2),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="angry_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_finally: ControlFlowTemplate, try_except: ControlFlowTemplate, happy_finally: ControlFlowTemplate, angry_finally: ControlFlowTemplate):
|
||||||
|
self.setup_finally = setup_finally
|
||||||
|
self.try_except = try_except
|
||||||
|
self.happy_finally = happy_finally
|
||||||
|
self.angry_finally = angry_finally
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
def verify_finally_match(cfg: nx.DiGraph, mapping: dict[str, ControlFlowTemplate]) -> bool:
|
||||||
|
# check to make sure that all non-stack/control instructions match between the two finally blocks
|
||||||
|
# this list was made for 3.9, so it may need to be expanded for other versions
|
||||||
|
stack_and_control_insts = {"POP_TOP", "POP_EXCEPT", "ROT_TWO", "ROT_THREE", "ROT_FOUR", "JUMP_FORWARD", "JUMP_BACKWARD", "JUMP_ABSOLUTE", "RERAISE"}
|
||||||
|
happy_insts = [(inst.opname, inst.arg) for inst in mapping["happy_finally"].get_instructions() if inst.opname not in stack_and_control_insts]
|
||||||
|
angry_insts = [(inst.opname, inst.arg) for inst in mapping["angry_finally"].get_instructions() if inst.opname not in stack_and_control_insts]
|
||||||
|
return happy_insts == angry_insts
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=FinallyTemplate._subgraph, root_key="setup_finally", mapping_verification_func=verify_finally_match)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
finally_template = FinallyTemplate(setup_finally=mapping["setup_finally"], try_except=mapping["try_except"], happy_finally=mapping["happy_finally"], angry_finally=mapping["angry_finally"])
|
||||||
|
|
||||||
|
in_edges = [(src, finally_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True)]
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the finally
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((finally_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping["tail"]:
|
||||||
|
out_edges.append((finally_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([finally_template.setup_finally, finally_template.try_except, finally_template.happy_finally, finally_template.angry_finally])
|
||||||
|
reduced_cfg.add_node(finally_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_block = self.try_except.to_indented_source(source_lines)
|
||||||
|
# pick one of the finally bodies to get the source code from
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self.happy_finally.to_indented_source(source_lines))
|
||||||
|
if not finally_body:
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self.angry_finally.to_indented_source(source_lines))
|
||||||
|
finally_lines = [try_block, "finally: # inserted", finally_body]
|
||||||
|
return "\n".join(finally_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
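The `verify_finally_match` check above can be tried in isolation. The following is a minimal sketch, not the project's code: `Inst` is a hypothetical stand-in for the instruction objects (only `opname` and `arg` are assumed), and `significant` and `finally_bodies_match` are illustrative names.

```python
from collections import namedtuple

# Hypothetical stand-in for an instruction object; only opname/arg are assumed.
Inst = namedtuple("Inst", ["opname", "arg"])

# Same opcode set as in verify_finally_match (the source notes it was made for 3.9).
STACK_AND_CONTROL = {
    "POP_TOP", "POP_EXCEPT", "ROT_TWO", "ROT_THREE", "ROT_FOUR",
    "JUMP_FORWARD", "JUMP_BACKWARD", "JUMP_ABSOLUTE", "RERAISE",
}


def significant(insts):
    """Drop stack-shuffling and control-flow opcodes, keep (opname, arg) pairs."""
    return [(i.opname, i.arg) for i in insts if i.opname not in STACK_AND_CONTROL]


def finally_bodies_match(happy, angry):
    """True when both copies of the finally body do the same non-control work."""
    return significant(happy) == significant(angry)


# The two copies differ only in how they exit: a jump versus a reraise.
happy = [Inst("LOAD_NAME", 0), Inst("CALL_FUNCTION", 0), Inst("POP_TOP", None), Inst("JUMP_FORWARD", 4)]
angry = [Inst("LOAD_NAME", 0), Inst("CALL_FUNCTION", 0), Inst("POP_TOP", None), Inst("RERAISE", None)]
# A block that loads a different name is not a copy of the same finally body.
other = [Inst("LOAD_NAME", 1), Inst("CALL_FUNCTION", 0), Inst("POP_TOP", None), Inst("RERAISE", None)]
```

Comparing `(opname, arg)` pairs rather than whole instruction objects keeps offsets and line numbers out of the comparison, since those legitimately differ between the two copies.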
+70
@@ -0,0 +1,70 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_in_degree, node_match_all
|
||||||
|
|
||||||
|
|
||||||
|
def is_cleanup(cfg: nx.DiGraph, node) -> bool:
|
||||||
|
i = node.get_instructions()
|
||||||
|
return len(i) == 2 and i[0].opname == "CALL_INTRINSIC_1" and i[1].opname == "RERAISE"
|
||||||
|
|
||||||
|
|
||||||
|
class GeneratorCleanupTemplate(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"generator": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="generator",
|
||||||
|
dest="cleanup",
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_in_degree(1), is_cleanup),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, generator: ControlFlowTemplate, cleanup: ControlFlowTemplate):
|
||||||
|
self.generator = generator
|
||||||
|
self.cleanup = cleanup
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=GeneratorCleanupTemplate._subgraph, root_key="generator", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = GeneratorCleanupTemplate(generator=mapping["generator"], cleanup=mapping["cleanup"])
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in cfg.in_edges(nbunch=node, data=True))
|
||||||
|
out_edges = ((template, dst, edge_properties) for src, dst, edge_properties in cfg.out_edges(nbunch=node, data=True) if dst != mapping["cleanup"])
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([template.cleanup, template.generator])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
return self.generator.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
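The `is_cleanup` predicate above only needs an object with a `get_instructions()` method, so it can be exercised without a real CFG. A minimal sketch follows, assuming a hypothetical `Block` class and `Inst` tuple that are not part of the project.

```python
from collections import namedtuple

# Hypothetical instruction record; the predicate only reads opname.
Inst = namedtuple("Inst", ["opname"])


class Block:
    """Hypothetical basic block exposing get_instructions(), as is_cleanup expects."""

    def __init__(self, *opnames):
        self._insts = [Inst(name) for name in opnames]

    def get_instructions(self):
        return self._insts


def is_cleanup(cfg, node):
    # Same test as in the file above: exactly CALL_INTRINSIC_1 followed by RERAISE.
    insts = node.get_instructions()
    return len(insts) == 2 and insts[0].opname == "CALL_INTRINSIC_1" and insts[1].opname == "RERAISE"


cleanup_block = Block("CALL_INTRINSIC_1", "RERAISE")
plain_reraise = Block("RERAISE")
```

The `cfg` argument is unused by the predicate itself; it is kept here only to mirror the two-argument shape that node verification functions have in the source.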
+167
@@ -0,0 +1,167 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from .TryExceptTemplate import TryExceptTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, edge_is_none_or_matches
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class TryExceptElseTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A `try-except` with an else and a structured except.
|
||||||
|
    (0)
   /   \\e      -->   (0123)
 (1)   (2)              |
  |j    |j             (4)
 (3)    |
   \\  /
    (4)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="try_footer",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_footer": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="try_footer", dest="else_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.JUMP)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"else_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="after_try_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="else_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="after_try_except",
|
||||||
|
edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.JUMP)),
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_body: ControlFlowTemplate, try_footer: ControlFlowTemplate, else_body: ControlFlowTemplate, except_body: ControlFlowTemplate):
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_footer = try_footer
|
||||||
|
self.else_body = else_body
|
||||||
|
self.except_body = except_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryExceptElseTemplate._subgraph, root_key="try_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
try_except_template = TryExceptElseTemplate(try_body=mapping["try_body"], try_footer=mapping["try_footer"], else_body=mapping["else_body"], except_body=mapping["except_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the try except
|
||||||
|
out_edges = []
|
||||||
|
if mapping.get("after_try_except", None):
|
||||||
|
out_edges.append((try_except_template, mapping["after_try_except"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_body, try_except_template.try_footer, try_except_template.else_body, try_except_template.except_body])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_except_template = TryExceptTemplate(try_body=self.try_body, try_footer=self.try_footer, except_body=self.except_body)
|
||||||
|
try_except_lines = [try_except_template.to_indented_source(source_lines)]
|
||||||
|
else_body = ControlFlowTemplate._indent_multiline_string(self.else_body.to_indented_source(source_lines))
|
||||||
|
try_except_lines.extend(["else: # inserted", else_body])
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
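Every `try_to_match_node` above ends with the same reduction step: copy the graph, remove the matched nodes, add the template node, and reattach edges. The sketch below shows that step under stated assumptions. It swaps `nx.DiGraph` for a plain set of nodes plus a dict of edges so that it runs with the standard library only, and `condense`, the node labels and the edge type strings are all illustrative. Unlike the source, it also drops incoming edges that originate inside the matched region.

```python
def condense(nodes, edges, root, members, merged, out_edges):
    """Return new (nodes, edges) with `members` collapsed into the single node `merged`.

    nodes: set of node labels.
    edges: dict mapping (src, dst) -> property dict.
    root: the matched node whose incoming edges are redirected to `merged`.
    out_edges: list of (dst, props) pairs to attach as outgoing edges of `merged`.
    """
    members = set(members)
    # Redirect edges that enter the matched region at its root.
    in_edges = {
        (src, merged): props
        for (src, dst), props in edges.items()
        if dst == root and src not in members
    }
    # Keep only edges that do not touch any removed node.
    kept = {
        (src, dst): props
        for (src, dst), props in edges.items()
        if src not in members and dst not in members
    }
    kept.update(in_edges)
    kept.update({(merged, dst): props for dst, props in out_edges})
    return (set(nodes) - members) | {merged}, kept


nodes = {"entry", "try_body", "try_footer", "else_body", "except_body", "after"}
edges = {
    ("entry", "try_body"): {"type": "natural"},
    ("try_body", "try_footer"): {"type": "natural"},
    ("try_body", "except_body"): {"type": "exception"},
    ("try_footer", "else_body"): {"type": "jump"},
    ("else_body", "after"): {"type": "natural"},
    ("except_body", "after"): {"type": "jump"},
}
matched = {"try_body", "try_footer", "else_body", "except_body"}
new_nodes, new_edges = condense(
    nodes, edges, "try_body", matched, "TryExceptElse", [("after", {"type": "natural"})]
)
```

Building a new graph instead of mutating the input mirrors the `cfg.copy()` call in the source, which leaves the caller's graph untouched when a later template fails to match.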
+188
@@ -0,0 +1,188 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ..abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
from ..loop.LoopExitTemplate import LoopExitTemplate
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import optional_node, optional_edge, assert_in_degree, node_match_all, assert_node_has_no_backwards_edges, node_is_none_or_matches
|
||||||
|
|
||||||
|
from .ExceptAsTemplate import ExceptAsTemplate
|
||||||
|
from .ExceptAsExceptTemplate import ExceptAsExceptTemplate
|
||||||
|
from ..subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class TryExceptTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A `try-except` block with just a naked except.
|
||||||
|
    (0)
   /   \\e      -->   (012)
 (1)   (2)              |
   \\j /j              (3)
    (3)
|
||||||
|
Either or both of the try and except branches may have no further control flow.
However, if both have successors, they must lead to the same node.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="try_footer",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_footer": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_is_none_or_matches(
|
||||||
|
node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="try_footer", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_body: ControlFlowTemplate, try_footer: ControlFlowTemplate, except_body: ControlFlowTemplate):
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_footer = try_footer
|
||||||
|
self.except_body = except_body
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryExceptTemplate._subgraph, root_key="try_body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# try_footer is an optional node, so look it up with .get() rather than indexing
# (this matters mostly for 3.7, where some try blocks have no footer at all)
|
||||||
|
|
||||||
|
try_except_template = TryExceptTemplate(try_body=mapping["try_body"], try_footer=mapping.get("try_footer", None), except_body=mapping["except_body"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the try except
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
if "after_try_except" in mapping:
|
||||||
|
after_try_except = mapping["after_try_except"]
|
||||||
|
out_edges.append((try_except_template, after_try_except, {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_body, try_except_template.try_footer, try_except_template.except_body])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_body = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
# check if there is a try footer as in 3.7 there may not be a try footer at all
|
||||||
|
if self.try_footer:
|
||||||
|
try_footer = ControlFlowTemplate._indent_multiline_string(self.try_footer.to_indented_source(source_lines))
|
||||||
|
else:
|
||||||
|
try_footer = ""
|
||||||
|
|
||||||
|
try_except_lines = ["try:", try_body, try_footer]
|
||||||
|
|
||||||
|
# if we matched against an "except ... as" chain, omit the inserted "except:" line
|
||||||
|
omit_except = False
|
||||||
|
if isinstance(self.except_body, AbstractExceptionBlockTemplate):
|
||||||
|
omit_except = True
|
||||||
|
elif isinstance(self.except_body, LoopExitTemplate):
|
||||||
|
if isinstance(self.except_body.tail, (ExceptAsTemplate, ExceptAsExceptTemplate)):
|
||||||
|
omit_except = True
|
||||||
|
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
if not omit_except:
|
||||||
|
try_except_lines.append("except:")
|
||||||
|
except_body = ControlFlowTemplate._indent_multiline_string(except_body)
|
||||||
|
|
||||||
|
try_except_lines.append(except_body)
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
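The way `to_indented_source` above assembles its output can be sketched without any template objects. In the snippet below, `indent_block` is a hypothetical stand-in for `ControlFlowTemplate._indent_multiline_string` (whose implementation is not shown here) built on `textwrap.indent`, and `render_try_except` is an illustrative name that takes already-rendered child source as strings.

```python
import textwrap


def indent_block(text, prefix="    "):
    """Hypothetical stand-in for ControlFlowTemplate._indent_multiline_string."""
    return textwrap.indent(text, prefix)


def render_try_except(try_body, except_body, try_footer="", omit_except=False):
    """Join the rendered pieces the same way the template does."""
    lines = ["try:", indent_block(try_body), indent_block(try_footer)]
    if not omit_except:
        # Bare handler: emit the "except:" line and indent the body under it.
        lines.append("except:")
        except_body = indent_block(except_body)
    # When omit_except is set, the body already carries its own "except ... as" header.
    lines.append(except_body)
    return "\n".join(lines)


bare = render_try_except("x = f()", "pass")
chained = render_try_except(
    "x = f()", "except ValueError as e:\n    handle(e)", omit_except=True
)
```

An empty `try_footer` still contributes an empty string to the join, so the rendered text contains a blank line before the handler, as it would in the source when no footer is present.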
+173
@@ -0,0 +1,173 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ..abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ..abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ...cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ..Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ..match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, assert_instruction_opname, edge_is_none_or_matches
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class TryFinallyTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
"""
|
||||||
|
A `try` block with only a `finally` following it.
|
||||||
|
    (0)
     |
    (1)
   /e\\         -->   (0123)
 (3) (2)                |
      |j               (4)
     (4)
|
||||||
|
Does not cover the additional copies of the finally block that are inserted into the bytecode for returns or for breaking out of loops.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="try_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="setup_finally", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="happy_finally",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="angry_finally",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"happy_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="happy_finally", dest="tail", edge_verification_func=edge_is_none_or_matches(assert_edge_type(ControlFlowEdgeType.JUMP))),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="happy_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"angry_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="angry_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_finally: ControlFlowTemplate, try_body: ControlFlowTemplate, happy_finally: ControlFlowTemplate, angry_finally: ControlFlowTemplate):
|
||||||
|
self.setup_finally = setup_finally
|
||||||
|
self.try_body = try_body
|
||||||
|
self.happy_finally = happy_finally
|
||||||
|
self.angry_finally = angry_finally
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cfg.in_degree(node) != 1:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# to avoid being treated as a try-except, we actually need to greedily search up one layer
|
||||||
|
pred = next(cfg.predecessors(node))
|
||||||
|
|
||||||
|
def verify_finally_match(cfg: nx.DiGraph, mapping: dict[str, ControlFlowTemplate]) -> bool:
|
||||||
|
# check to make sure that all non-stack/control instructions match between the two finally blocks
|
||||||
|
# this list was made for 3.9, so it may need to be expanded for other versions
|
||||||
|
stack_and_control_insts = {"POP_TOP", "POP_EXCEPT", "ROT_TWO", "ROT_THREE", "ROT_FOUR", "JUMP_FORWARD", "JUMP_BACKWARD", "JUMP_ABSOLUTE", "RERAISE"}
|
||||||
|
happy_insts = [(inst.opname, inst.arg) for inst in mapping["happy_finally"].get_instructions() if inst.opname not in stack_and_control_insts]
|
||||||
|
angry_insts = [(inst.opname, inst.arg) for inst in mapping["angry_finally"].get_instructions() if inst.opname not in stack_and_control_insts]
|
||||||
|
return happy_insts == angry_insts
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryFinallyTemplate._subgraph, root_key="setup_finally", mapping_verification_func=verify_finally_match)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, pred)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
finally_template = TryFinallyTemplate(setup_finally=mapping["setup_finally"], try_body=mapping["try_body"], happy_finally=mapping["happy_finally"], angry_finally=mapping["angry_finally"])
|
||||||
|
|
||||||
|
in_edges = [(src, finally_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(finally_template.setup_finally, data=True)]
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the finally
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((finally_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping["tail"]:
|
||||||
|
out_edges.append((finally_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([finally_template.setup_finally, finally_template.try_body, finally_template.happy_finally, finally_template.angry_finally])
|
||||||
|
reduced_cfg.add_node(finally_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
# sometimes the setup finally is included in a linear sequence, so we need to include that source
|
||||||
|
setup_finally = self.setup_finally.to_indented_source(source_lines)
|
||||||
|
try_block = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
# pick one of the finally bodies to get the source code from
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self.happy_finally.to_indented_source(source_lines))
|
||||||
|
if not finally_body:
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self.angry_finally.to_indented_source(source_lines))
|
||||||
|
finally_lines = [setup_finally, "try:", try_block, "finally: # inserted", finally_body]
|
||||||
|
return "\n".join(finally_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+159
@@ -0,0 +1,159 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, node_match_all, node_match_any, contains_opname_sequence
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsCleanupSubTemplate311(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
The boilerplate cleanup at the end of an `except as` block after 3.11.
|
||||||
|
The "happy cleanup" (3) is when there is no exception, and it jumps out to the next code segment (except footer in 3.11).
|
||||||
|
The "angry cleanup" (2) is when there is an exception, and it reraises.
|
||||||
|
    (0)
     | \\e
    (1)  |
    / |e  |       -->   (012)
 (3)(2)   |              |  \\e
     |e  /              (3)  (4)
     (4)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_header": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_header",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="happy_cleanup",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="angry_cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"happy_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
node_match_any(
|
||||||
|
contains_opname_sequence(
|
||||||
|
"LOAD_CONST",
|
||||||
|
"STORE_NAME",
|
||||||
|
"DELETE_NAME",
|
||||||
|
),
|
||||||
|
contains_opname_sequence(
|
||||||
|
"LOAD_CONST",
|
||||||
|
"STORE_FAST",
|
||||||
|
"DELETE_FAST",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="happy_cleanup", dest=None, edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="happy_cleanup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"angry_cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
node_match_any(
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "RERAISE"),
|
||||||
|
contains_opname_sequence("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="angry_cleanup",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_header: ControlFlowTemplate, except_body: ControlFlowTemplate, angry_cleanup: ControlFlowTemplate):
|
||||||
|
self.except_header = except_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.angry_cleanup = angry_cleanup
|
||||||
|
|
||||||
|
    @staticmethod
    def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
        """
        Attempts to match this template on the graph at the given node.
        If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
        If the node is not in the cfg, returns None.
        If the subtemplate does not match, returns the cfg unchanged so the main template can still be tried.
        """
        if node not in cfg.nodes:
            return None

        matcher = GraphTemplateMatcher(template_node_dict=ExceptAsCleanupSubTemplate311._subgraph, root_key="except_header", mapping_verification_func=None)

        mapping = matcher.match_at_graph_node(cfg, node)

        if not mapping:
            return cfg  # if we didn't match the subtemplate, keep trying with the main template

        except_as_cleanup_template = ExceptAsCleanupSubTemplate311(except_header=mapping["except_header"], except_body=mapping["except_body"], angry_cleanup=mapping["angry_cleanup"])

        in_edges = ((src, except_as_cleanup_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
        out_edges = []
        if mapping["happy_cleanup"]:
            out_edges.append((except_as_cleanup_template, mapping["happy_cleanup"], {"type": ControlFlowEdgeType.NATURAL.value}))
        if mapping["panic_except"]:
            out_edges.append((except_as_cleanup_template, mapping["panic_except"], {"type": ControlFlowEdgeType.EXCEPTION.value}))

        reduced_cfg = cfg.copy()
        reduced_cfg.remove_nodes_from([except_as_cleanup_template.except_header, except_as_cleanup_template.except_body, except_as_cleanup_template.angry_cleanup])
        reduced_cfg.add_node(except_as_cleanup_template)
        reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
        return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
# cleanup code is implicit! only report the body code
|
||||||
|
return self.except_body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
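
The condensation step used by `try_to_match_node` above can be illustrated without networkx. The following is a minimal sketch, assuming a plain edge list stands in for `nx.DiGraph`; the helper `condense`, the node names, and the toy edges are hypothetical and are not part of the repository. Unlike the original, it also skips in-edges that originate from matched nodes.

```python
from typing import Hashable

# (source, destination, edge type); a stand-in for networkx edge data
Edge = tuple[Hashable, Hashable, str]


def condense(edges: list[Edge], matched: set, root: Hashable, merged: Hashable, exits: dict) -> list[Edge]:
    """Collapse the `matched` nodes into the single node `merged`.

    - edges that touch a matched node are dropped
    - edges entering `root` from outside the match are redirected to `merged`
    - one outgoing edge is added per entry in `exits` (destination -> edge type)
    """
    kept = [e for e in edges if e[0] not in matched and e[1] not in matched]
    incoming = [(src, merged, kind) for src, dst, kind in edges if dst == root and src not in matched]
    outgoing = [(merged, dst, kind) for dst, kind in exits.items()]
    return kept + incoming + outgoing


# Toy graph loosely modeled on the cleanup subtemplate (illustrative only).
toy_edges = [
    ("try_body", "except_header", "exception"),
    ("except_header", "except_body", "natural"),
    ("except_header", "panic_except", "exception"),
    ("except_body", "happy_cleanup", "natural"),
    ("except_body", "angry_cleanup", "exception"),
    ("angry_cleanup", "panic_except", "exception"),
    ("happy_cleanup", "after_except", "natural"),
]

reduced = condense(
    toy_edges,
    matched={"except_header", "except_body", "angry_cleanup"},
    root="except_header",
    merged="ExceptAsCleanup",
    exits={"happy_cleanup": "natural", "panic_except": "exception"},
)
```

Only the natural exit and the exception exit survive on the merged node, which mirrors how the method above keeps `happy_cleanup` and `panic_except` as the only successors.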
|
||||||
+228
@@ -0,0 +1,228 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, node_is_none_or_matches, assert_instruction_opname, assert_node_type
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsNonMatchSubTemplate311(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
The non-match path of an except-as, which can be:
|
||||||
|
1. a standalone reraise (end of an except as chain)
|
||||||
|
2. an except block, which may exit
|
||||||
|
3. a structured except-as
|
||||||
|
"""
|
||||||
|
|
||||||
|
_reraise_subgraph = {
|
||||||
|
"reraise": TemplateNode(node_verification_func=assert_instruction_opname("RERAISE"), exception_edge=TemplateEdge(source="reraise", dest="panic_except")),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
_except_subgraph = {
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="except_footer",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_footer": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_is_none_or_matches(assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="after_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
_structered_except_as_subgraph = {
|
||||||
|
"except_as": TemplateNode(node_verification_func=assert_node_type(AbstractNonSequentiable), natural_edge=TemplateEdge(source="except_as", dest=None), exception_edge=TemplateEdge(source="except_as", dest="panic_except")),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_body: ControlFlowTemplate, except_footer: ControlFlowTemplate):
|
||||||
|
self.except_body = except_body
|
||||||
|
self.except_footer = except_footer
|
||||||
|
|
||||||
|
    @staticmethod
    def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
        """
        Attempts to match this template on the graph at the given node.
        If the node is a standalone reraise or an already structured except-as, returns the cfg unchanged.
        If the node starts an except block, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
        Otherwise, returns None.
        """
        if node not in cfg.nodes:
            return None

        # start by trying to match reraise
        matcher = GraphTemplateMatcher(template_node_dict=ExceptAsNonMatchSubTemplate311._reraise_subgraph, root_key="reraise", mapping_verification_func=None)

        mapping = matcher.match_at_graph_node(cfg, node)

        if mapping:
            # single-node subgraph does not need to be updated
            return cfg

        # didn't match reraise; try to match structured except as
        matcher = GraphTemplateMatcher(template_node_dict=ExceptAsNonMatchSubTemplate311._structered_except_as_subgraph, root_key="except_as", mapping_verification_func=None)

        mapping = matcher.match_at_graph_node(cfg, node)

        if mapping:
            # the except-as is already a single structured node, so the cfg does not need to be updated
            return cfg

        # didn't match structured except as; try to match except block
        matcher = GraphTemplateMatcher(template_node_dict=ExceptAsNonMatchSubTemplate311._except_subgraph, root_key="except_body", mapping_verification_func=None)

        mapping = matcher.match_at_graph_node(cfg, node)

        if not mapping:
            return None

        except_template = ExceptAsNonMatchSubTemplate311(except_body=mapping["except_body"], except_footer=mapping["except_footer"])

        in_edges = ((src, except_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
        # only preserve the exception edge and the jump to the code after the except block
        out_edges = []
        if mapping["panic_except"]:
            out_edges.append((except_template, mapping["panic_except"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
        if mapping["after_except"]:
            out_edges.append((except_template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))

        reduced_cfg = cfg.copy()
        reduced_cfg.remove_node(except_template.except_body)
        if except_template.except_footer:
            reduced_cfg.remove_node(except_template.except_footer)
        reduced_cfg.add_node(except_template)
        reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
        return reduced_cfg
|
||||||
|
|
||||||
|
    def to_indented_source(self, source_lines: list[str]) -> str:
        """
        Returns the source code for this template, recursively calling into its children to create the full source code.
        """

        except_lines = ["except:"]
        body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
        except_lines.append(body)
        # the footer is optional in the template (try_to_match_node also checks for it), so it may be None
        if self.except_footer:
            footer = ControlFlowTemplate._indent_multiline_string(self.except_footer.to_indented_source(source_lines))
            except_lines.append(footer)

        return "\n".join(except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
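
The matching order in `try_to_match_node` above (reraise, then structured except-as, then except block) is an ordered fallback in which the first matcher that succeeds wins. Below is a minimal sketch of that control flow only, assuming a plain dict in place of `nx.DiGraph`; `first_match`, the toy matchers, and the opname labels are hypothetical stand-ins, not the repository's API.

```python
from typing import Callable, Hashable, Optional

Graph = dict  # toy stand-in: node -> label
Matcher = Callable[[Graph, Hashable], Optional[Graph]]


def first_match(cfg: Graph, node: Hashable, matchers: list) -> Optional[Graph]:
    """Return the result of the first matcher that accepts `node`, else None."""
    if node not in cfg:
        return None
    for matcher in matchers:
        result = matcher(cfg, node)
        if result is not None:
            return result
    return None


def match_reraise(cfg: Graph, node: Hashable) -> Optional[Graph]:
    # A single-node match: nothing to condense, so the graph is returned as-is.
    return cfg if cfg[node] == "RERAISE" else None


def match_except_block(cfg: Graph, node: Hashable) -> Optional[Graph]:
    # A multi-node match would condense nodes; here we only relabel a copy.
    if cfg[node] != "EXCEPT_BODY":
        return None
    reduced = dict(cfg)
    reduced[node] = "ExceptTemplate"
    return reduced


toy_cfg = {"a": "RERAISE", "b": "EXCEPT_BODY", "c": "NOP"}
TOY_MATCHERS = [match_reraise, match_except_block]
```

Returning the unchanged graph for single-node matches and `None` for a failed match keeps the three outcomes distinguishable for the caller, which is the convention the method above appears to follow.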
|
||||||
+196
@@ -0,0 +1,196 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ...subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
from .ExceptAsNonMatchSubtemplate311 import ExceptAsNonMatchSubTemplate311
|
||||||
|
from .ExceptAsCleanupSubTemplate311 import ExceptAsCleanupSubTemplate311
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptAsTemplate311(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
    """
    An `except as` block, after its cleanup has been structured.
    If there are multiple, this will match the last block in the series and set up the next one to be matched.

       (0)
       / \\j     -->   (0123)
     (1) (2)             |j
      |                 (4)
     (3)
      |j
     (4)

    0, 1, and 2 all have an exception edge to the panic cleanup from the current try block.
    """
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExceptAsCleanupSubTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="except_footer",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
subtemplate=ExceptAsNonMatchSubTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
source="non_match_path",
|
||||||
|
dest="after_except",
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_footer": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="after_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_body: ControlFlowTemplate, except_footer: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.except_footer = except_footer
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptAsTemplate311._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_as_template = ExceptAsTemplate311(except_as_header=mapping["except_as_header"], except_body=mapping["except_body"], except_footer=mapping["except_footer"], non_match_path=mapping["non_match_path"])
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
out_edges = []
|
||||||
|
if mapping["panic_except"]:
|
||||||
|
out_edges.append((except_as_template, mapping["panic_except"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping.get("after_except", None):
|
||||||
|
out_edges.append((except_as_template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([except_as_template.except_as_header, except_as_template.except_body, except_as_template.except_footer, except_as_template.non_match_path])
|
||||||
|
reduced_cfg.add_node(except_as_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
except_lines = []
|
||||||
|
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines)
|
||||||
|
except_lines.append(header)
|
||||||
|
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
except_lines.append(body)
|
||||||
|
|
||||||
|
footer = ControlFlowTemplate._indent_multiline_string(self.except_footer.to_indented_source(source_lines))
|
||||||
|
except_lines.append(footer)
|
||||||
|
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
except_lines.append(non_match)
|
||||||
|
|
||||||
|
return "\n".join(except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
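
`to_indented_source` above emits the header, the indented body and footer, and then the non-match path at the same indentation level, so a series of handlers is rendered from the last one backwards. The sketch below illustrates that nesting, assuming four-space indentation via `textwrap.indent` (the behavior of `_indent_multiline_string` is not shown in this file); `render_handler_chain` and the sample handlers are hypothetical.

```python
import textwrap
from typing import Optional


def render_handler_chain(handlers: list, tail: Optional[str] = None) -> str:
    """Render (header, body) pairs given in source order.

    The chain is folded from the last handler to the first, so each handler
    is followed by the already rendered remainder (its non-match path).
    """
    rendered = tail or ""
    for header, body in reversed(handlers):
        lines = [header, textwrap.indent(body, "    ")]
        if rendered:
            lines.append(rendered)
        rendered = "\n".join(lines)
    return rendered


sample = render_handler_chain(
    [
        ("except KeyError as e:", "log(e)"),
        ("except ValueError as e:", "raise"),
    ]
)
```

Here `sample` holds the two handlers one after the other at the same level, each with its body indented by four spaces.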
|
||||||
+430
@@ -0,0 +1,430 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
|
||||||
|
from ...natural.InstructionTemplate import InstructionTemplate
|
||||||
|
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import (
|
||||||
|
optional_node,
|
||||||
|
optional_edge,
|
||||||
|
assert_in_degree,
|
||||||
|
node_match_all,
|
||||||
|
assert_first_instruction_opname,
|
||||||
|
ends_with_opname_sequence,
|
||||||
|
is_exactly_opname,
|
||||||
|
node_match_any,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptTemplate311(ControlFlowTemplate):
|
||||||
|
    @staticmethod
    def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
        """
        Dispatches to the concrete except templates.
        A standalone RERAISE instruction or an already structured ExceptETemplate311 is accepted as-is, and the cfg is returned unchanged.
        Otherwise tries ExceptETemplate311, then BareExcept311, and returns None if neither matches.
        """
        # a standalone reraise needs no further structuring
        if isinstance(node, InstructionTemplate) and node.instruction.opname == "RERAISE":
            return cfg
        # the node has already been structured
        if isinstance(node, ExceptETemplate311):
            return cfg
        new_cfg = ExceptETemplate311.try_to_match_node(cfg, node)
        if new_cfg is not None:
            return new_cfg
        new_cfg = BareExcept311.try_to_match_node(cfg, node)
        if new_cfg is not None:
            return new_cfg
        return None
|
||||||
|
|
||||||
|
|
||||||
|
class Footer(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"swap": TemplateNode(node_verification_func=node_match_all(is_exactly_opname("SWAP"), assert_in_degree(1)), natural_edge=TemplateEdge(source="swap", dest="footer"), exception_edge=TemplateEdge(source="swap", dest="panic")),
|
||||||
|
"footer": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
conditional_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"panic": TemplateNode(node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"), exception_edge=TemplateEdge(source="panic", dest=None, edge_verification_func=optional_edge)),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, footer):
|
||||||
|
self.footer = footer
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=Footer._subgraph, root_key="swap", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return cfg
|
||||||
|
|
||||||
|
template = Footer(mapping["footer"])
|
||||||
|
|
||||||
|
edges = [(next(cfg.predecessors(node)), template, {"type": ControlFlowEdgeType.NATURAL.value})]
|
||||||
|
edges.extend((template, dst, prop) for src, dst, prop in cfg.out_edges(mapping["footer"], data=True))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from((node, mapping["footer"]))
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(edges)
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines):
|
||||||
|
return self.footer.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptBody(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"store": TemplateNode(
|
||||||
|
node_verification_func=node_match_any(is_exactly_opname("STORE_FAST"), is_exactly_opname("STORE_NAME")),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="store",
|
||||||
|
dest="body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="store",
|
||||||
|
dest="panic",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest="footer",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
node_verification_func=node_match_any(
|
||||||
|
is_exactly_opname("LOAD_CONST", "STORE_FAST", "DELETE_FAST", "RERAISE"),
|
||||||
|
is_exactly_opname("LOAD_CONST", "STORE_NAME", "DELETE_NAME", "RERAISE"),
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="cleanup", dest="panic"),
|
||||||
|
),
|
||||||
|
"panic": TemplateNode(exception_edge=TemplateEdge(source="panic", dest=None, edge_verification_func=optional_edge)),
|
||||||
|
"footer": TemplateNode(
|
||||||
|
subtemplate=Footer,
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
conditional_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(source="footer", dest=None, edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptBody._subgraph, root_key="store", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return cfg
|
||||||
|
|
||||||
|
header = next(cfg.predecessors(node))
|
||||||
|
footer = mapping.get("footer")
|
||||||
|
|
||||||
|
template = ExceptBody(mapping["body"])
|
||||||
|
edges = [(header, template, {"type": ControlFlowEdgeType.NATURAL.value}), (template, mapping["panic"], {"type": ControlFlowEdgeType.EXCEPTION.value})]
|
||||||
|
if footer:
|
||||||
|
edges.append((template, footer, {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([mapping["store"], template.body, mapping["cleanup"]])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(edges)
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def __init__(self, body):
|
||||||
|
self.body = body
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines):
|
||||||
|
return self.body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
|
||||||
|
class BareExcept311(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="body", dest="footer"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="body",
|
||||||
|
dest="panic",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"footer": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_first_instruction_opname("POP_EXCEPT"), assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="footer",
|
||||||
|
dest="after_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic": TemplateNode(node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"), exception_edge=TemplateEdge(source="panic", dest="outer_exception_handler", edge_verification_func=optional_edge)),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
        "outer_exception_handler": TemplateNode(
            node_verification_func=optional_node,
            natural_edge=TemplateEdge(
                source="outer_exception_handler",
                dest=None,
                edge_verification_func=optional_edge,
                commit_none_to_mapping=False,
            ),
            conditional_edge=TemplateEdge(
                source="outer_exception_handler",
                dest=None,
                edge_verification_func=optional_edge,
                commit_none_to_mapping=False,
            ),
            exception_edge=TemplateEdge(
                source="outer_exception_handler",
                dest=None,
                edge_verification_func=optional_edge,
                commit_none_to_mapping=False,
            ),
        ),
|
||||||
|
}
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=BareExcept311._subgraph, root_key="body", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = BareExcept311(
|
||||||
|
body=mapping["body"],
|
||||||
|
footer=mapping["footer"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
out_edges = []
|
||||||
|
if mapping["panic"]:
|
||||||
|
out_edges.append((template, mapping["panic"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping.get("after_except", None):
|
||||||
|
out_edges.append((template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([template.body, template.footer])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def __init__(self, body, footer):
|
||||||
|
self.body = body
|
||||||
|
self.footer = footer
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines):
|
||||||
|
return "\n".join(["except:", self._indent_multiline_string(self.body.to_indented_source(source_lines)), self._indent_multiline_string(self.footer.to_indented_source(source_lines))])
|
||||||
|
|
||||||
|
|
||||||
|
class ExceptETemplate311(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"except_header": TemplateNode(
|
||||||
|
node_verification_func=node_match_any(
|
||||||
|
ends_with_opname_sequence("CHECK_EXC_MATCH", "POP_JUMP_FORWARD_IF_FALSE"),
|
||||||
|
ends_with_opname_sequence("CHECK_EXC_MATCH", "POP_JUMP_IF_FALSE"),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_header",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_header",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExceptBody,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="except_footer", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
subtemplate=ExceptTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="panic_except",
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
source="non_match_path",
|
||||||
|
dest="after_except",
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_footer": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(assert_first_instruction_opname("POP_EXCEPT"), assert_in_degree(1)),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="after_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_header: ControlFlowTemplate, except_body: ControlFlowTemplate, except_footer: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_header = except_header
|
||||||
|
self.except_body = except_body
|
||||||
|
self.except_footer = except_footer
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=ExceptETemplate311._subgraph, root_key="except_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
template = ExceptETemplate311(except_header=mapping["except_header"], except_body=mapping["except_body"], except_footer=mapping.get("except_footer"), non_match_path=mapping["non_match_path"])
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve the exception-handling edge and the jump to the code after the except block
|
||||||
|
out_edges = []
|
||||||
|
if mapping["panic_except"]:
|
||||||
|
out_edges.append((template, mapping["panic_except"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
if mapping.get("after_except", None):
|
||||||
|
out_edges.append((template, mapping["after_except"], {"type": ControlFlowEdgeType.JUMP.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([template.except_header, template.except_body, template.non_match_path])
|
||||||
|
if template.except_footer:
|
||||||
|
reduced_cfg.remove_node(template.except_footer)
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
except_lines = []
|
||||||
|
|
||||||
|
header = self.except_header.to_indented_source(source_lines)
|
||||||
|
except_lines.append(header)
|
||||||
|
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(self.except_body.to_indented_source(source_lines))
|
||||||
|
except_lines.append(body)
|
||||||
|
|
||||||
|
if self.except_footer:
|
||||||
|
footer = ControlFlowTemplate._indent_multiline_string(self.except_footer.to_indented_source(source_lines))
|
||||||
|
except_lines.append(footer)
|
||||||
|
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
except_lines.append(non_match)
|
||||||
|
|
||||||
|
return "\n".join(except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
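The `try_to_match_node` methods above all finish the same way: the matched nodes are removed from a copy of the graph, a single template node takes their place, and a chosen subset of edges is re-attached. The following is a minimal sketch of that reduction step, assuming a plain edge list instead of `nx.DiGraph` so that it runs without dependencies. The function name `collapse_nodes` and its parameters are hypothetical and are not part of the repository.

```python
from itertools import chain


def collapse_nodes(edges, matched, entry, template, kept_out_edges):
    """Return a new edge list in which the `matched` nodes are replaced by `template`.

    edges: list of (src, dst, props) triples describing a directed graph.
    matched: set of nodes absorbed into the template.
    entry: the matched node whose incoming edges are redirected to the template.
    kept_out_edges: (dst, props) pairs re-attached as the template's outgoing edges.
    """
    # Redirect edges that enter the matched region through its entry node.
    # Unlike the repository code, edges that start inside the region are skipped here.
    in_edges = [(src, template, props) for src, dst, props in edges if dst == entry and src not in matched]
    # Drop every edge that touches an absorbed node.
    survivors = [(s, d, p) for s, d, p in edges if s not in matched and d not in matched]
    out_edges = [(template, dst, props) for dst, props in kept_out_edges]
    return list(chain(survivors, in_edges, out_edges))


# Hypothetical miniature CFG: a -> hdr -> body -> after, with hdr -> panic on exception.
edges = [
    ("a", "hdr", {"type": "natural"}),
    ("hdr", "body", {"type": "natural"}),
    ("body", "after", {"type": "jump"}),
    ("hdr", "panic", {"type": "exception"}),
]
reduced = collapse_nodes(
    edges,
    matched={"hdr", "body"},
    entry="hdr",
    template="T",
    kept_out_edges=[("panic", {"type": "exception"}), ("after", {"type": "jump"})],
)
```

Only the edges passed in `kept_out_edges` survive on the outgoing side, which mirrors how the templates keep the exception edge and the jump past the except block and discard the rest.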
+243
@@ -0,0 +1,243 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...natural.InstructionTemplate import InstructionTemplate
|
||||||
|
from ...natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
from ...if_then.IfThenTemplate import IfThenTemplate
|
||||||
|
from ...if_then.IfElseTemplate import IfElseTemplate
|
||||||
|
from .TryTemplate311 import TryTemplate311
|
||||||
|
from .TryTemplate312 import TryTemplate312
|
||||||
|
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import (
|
||||||
|
assert_edge_type,
|
||||||
|
optional_node,
|
||||||
|
optional_edge,
|
||||||
|
assert_in_degree,
|
||||||
|
node_match_all,
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_instruction_opname,
|
||||||
|
is_exactly_opname,
|
||||||
|
assert_node_type,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class FinallyTemplate312(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"try_header": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("NOP"),
|
||||||
|
natural_edge=TemplateEdge(source="try_header", dest="try_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="try_body", dest="finally_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="fail",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"finally_body": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="finally_body", dest=None, edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="finally_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"fail": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="fail",
|
||||||
|
dest="panic_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
_subgraph2 = {
|
||||||
|
"try_except": TemplateNode(
|
||||||
|
node_verification_func=assert_node_type(TryTemplate311, TryTemplate312),
|
||||||
|
natural_edge=TemplateEdge(source="try_except", dest="finally_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_except",
|
||||||
|
dest="fail",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"finally_body": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="finally_body", dest=None, edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="finally_body",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"fail": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="fail",
|
||||||
|
dest="panic_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_header: ControlFlowTemplate, try_body: ControlFlowTemplate, finally_body: ControlFlowTemplate, fail: ControlFlowTemplate, panic_except: ControlFlowTemplate, cutoff):
|
||||||
|
self.try_header = try_header
|
||||||
|
self.try_body = try_body
|
||||||
|
self.finally_body = finally_body
|
||||||
|
self.fail = fail
|
||||||
|
self.panic_except = panic_except
|
||||||
|
self.cutoff = cutoff
|
||||||
|
|
||||||
|
    @staticmethod
    def mapping_verification_func(cfg, mapping):
        # Check that "fail" is the duplicated copy of the finally body and record where the finally block ends.
        finally_body = mapping["finally_body"]
        fail = mapping["fail"]
        # the copy on the exception path must not start any source line
        if any(x.starts_line is not None for x in fail.get_instructions()):
            return False
        if not isinstance(finally_body, LinearSequenceTemplate):
            finally_body = LinearSequenceTemplate(finally_body)
        if not isinstance(fail, LinearSequenceTemplate):
            fail = LinearSequenceTemplate(fail)
        # strip the PUSH_EXC_INFO / RERAISE wrapper around the duplicated body
        if fail.members and isinstance(fail.members[0], InstructionTemplate) and fail.members[0].instruction.opname == "PUSH_EXC_INFO":
            fail.members = fail.members[1:]
        if fail.members and isinstance(fail.members[-1], InstructionTemplate) and fail.members[-1].instruction.opname == "RERAISE":
            fail.members = fail.members[:-1]
        cutoff = None
        for x, y in zip(finally_body.members, fail.members):
            if type(x) is not type(y) and not all(type(a) in [IfThenTemplate, IfElseTemplate] for a in (x, y)):
                return False
            cutoff = x
        # nothing was compared, so there is no cutoff to record
        if cutoff is None:
            return False
        mapping["cutoff"] = cutoff
        return True
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=FinallyTemplate312._subgraph, root_key="try_header", mapping_verification_func=FinallyTemplate312.mapping_verification_func)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=FinallyTemplate312._subgraph2, root_key="try_except", mapping_verification_func=FinallyTemplate312.mapping_verification_func)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
mapping["try_header"] = None
|
||||||
|
mapping["try_body"] = mapping["try_except"]
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
|
||||||
|
# "bite off" the NOP from a linear sequence template
|
||||||
|
if isinstance(mapping["try_header"], LinearSequenceTemplate):
|
||||||
|
# grab the nop and update the linear sequence
|
||||||
|
nop_inst_template = mapping["try_header"].members[-1]
|
||||||
|
mapping["try_header"].members = mapping["try_header"].members[:-1]
|
||||||
|
if len(mapping["try_header"].members) == 1:
|
||||||
|
nx.relabel_nodes(reduced_cfg, {mapping["try_header"]: mapping["try_header"].members[0]}, copy=False)
|
||||||
|
mapping["try_header"] = mapping["try_header"].members[0]
|
||||||
|
|
||||||
|
# transfer outgoing edges to the bitten off chunk
|
||||||
|
header_out_edges = list(reduced_cfg.out_edges(mapping["try_header"], data=True))
|
||||||
|
reduced_cfg.add_node(nop_inst_template)
|
||||||
|
reduced_cfg.remove_edges_from(header_out_edges)
|
||||||
|
reduced_cfg.add_edges_from((nop_inst_template, dst, data) for src, dst, data in header_out_edges)
|
||||||
|
reduced_cfg.add_edge(mapping["try_header"], nop_inst_template, type=ControlFlowEdgeType.NATURAL.value)
|
||||||
|
mapping["try_header"] = nop_inst_template
|
||||||
|
|
||||||
|
template = FinallyTemplate312(try_header=mapping["try_header"], try_body=mapping["try_body"], finally_body=mapping["finally_body"], fail=mapping["fail"], panic_except=mapping["panic_except"], cutoff=mapping["cutoff"])
|
||||||
|
|
||||||
|
in_edges = ((src, template, edge_properties) for src, dst, edge_properties in reduced_cfg.in_edges(template.try_header or template.try_body, data=True))
|
||||||
|
out_edges = [(template, dst, edge_properties) for src, dst, edge_properties in reduced_cfg.out_edges(template.finally_body, data=True)]
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg.remove_nodes_from([template.try_header, template.try_body, template.finally_body, template.fail, template.panic_except])
|
||||||
|
reduced_cfg.add_node(template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
try_header = self.try_header.to_indented_source(source_lines) if self.try_header else ""
|
||||||
|
try_body = self._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
if isinstance(self.finally_body, LinearSequenceTemplate):
|
||||||
|
i = self.finally_body.members.index(self.cutoff) + 1
|
||||||
|
in_finally = self._indent_multiline_string(LinearSequenceTemplate(*self.finally_body.members[:i]).to_indented_source(source_lines))
|
||||||
|
after = LinearSequenceTemplate(*self.finally_body.members[i:]).to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
in_finally = self._indent_multiline_string(self.finally_body.to_indented_source(source_lines))
|
||||||
|
after = ""
|
||||||
|
|
||||||
|
lines = [try_header, "try:", try_body, "finally: # inserted", in_finally, after]
|
||||||
|
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
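`FinallyTemplate312` relies on the finally body appearing twice in the bytecode: once on the normal path and once on the exception path (`fail`). Comparing the two copies member by member gives the cutoff between statements inside the `finally:` block and statements that merely follow it. The following is a minimal sketch of that idea, assuming plain Python lists in place of `LinearSequenceTemplate.members`. The helper names `find_cutoff` and `split_finally` are hypothetical, and the special case that treats `IfThenTemplate` and `IfElseTemplate` as interchangeable is left out.

```python
def find_cutoff(finally_members, fail_members):
    """Return the index of the last finally member mirrored in the fail copy, or None."""
    cutoff = None
    for i, (x, y) in enumerate(zip(finally_members, fail_members)):
        if type(x) is not type(y):
            # The two sequences diverge in shape, so this is not a duplicated finally body.
            return None
        cutoff = i
    return cutoff


def split_finally(finally_members, fail_members):
    """Split the normal-path sequence into (inside finally, after finally), or return None."""
    cutoff = find_cutoff(finally_members, fail_members)
    if cutoff is None:
        return None
    return finally_members[: cutoff + 1], finally_members[cutoff + 1 :]
```

For example, `split_finally(["cleanup()", "log()", "return x"], ["cleanup()", "log()"])` places the first two entries inside the finally block and leaves `"return x"` after it, because the exception-path copy holds only two members.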
+213
@@ -0,0 +1,213 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
from ...natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
from ...loop.LoopExitTemplate import LoopExitTemplate
|
||||||
|
|
||||||
|
from .ExceptAsNonMatchSubtemplate311 import ExceptAsNonMatchSubTemplate311
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import assert_edge_type, optional_node, optional_edge, assert_in_degree, node_match_all, assert_node_has_no_backwards_edges, assert_instruction_opname
|
||||||
|
from ...subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class TryExceptTemplate311(ControlFlowTemplate):
|
||||||
|
    """
    A `try-except` block with just a naked except in Python 3.11+.
       (-1)
        |         (-1)
       (0)         |
       / \\e  --> (01235)
     (1) (2)       |
      |   | \\e   (4)
      |  (3) (5)
      \\j /j
       (4)
    One or more of the try/except may have no further control flow.
    However, if both have successors, they must go to the same place.
    """
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"try_header": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("NOP"),
|
||||||
|
natural_edge=TemplateEdge(source="try_header", dest="try_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="try_body", dest="try_footer", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_footer": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="try_footer", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExceptAsNonMatchSubTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
)
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_header: ControlFlowTemplate, try_body: ControlFlowTemplate, try_footer: ControlFlowTemplate, except_body: ControlFlowTemplate, panic_except: ControlFlowTemplate):
|
||||||
|
self.try_header = try_header
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_footer = try_footer
|
||||||
|
self.except_body = except_body
|
||||||
|
self.panic_except = panic_except
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph | None:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryExceptTemplate311._subgraph, root_key="try_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
reduced_cfg: nx.DiGraph = cfg.copy()
|
||||||
|
# "bite off" the NOP from a linear sequence template
|
||||||
|
if isinstance(mapping["try_header"], LinearSequenceTemplate):
|
||||||
|
# grab the nop and update the linear sequence
|
||||||
|
nop_inst_template = mapping["try_header"].members[-1]
|
||||||
|
mapping["try_header"].members = mapping["try_header"].members[:-1]
|
||||||
|
if len(mapping["try_header"].members) == 1:
|
||||||
|
nx.relabel_nodes(reduced_cfg, {mapping["try_header"]: mapping["try_header"].members[0]}, copy=False)
|
||||||
|
mapping["try_header"] = mapping["try_header"].members[0]
|
||||||
|
|
||||||
|
# transfer outgoing edges to the bitten off chunk
|
||||||
|
header_out_edges = list(reduced_cfg.out_edges(mapping["try_header"], data=True))
|
||||||
|
reduced_cfg.add_node(nop_inst_template)
|
||||||
|
reduced_cfg.remove_edges_from(header_out_edges)
|
||||||
|
reduced_cfg.add_edges_from((nop_inst_template, dst, data) for src, dst, data in header_out_edges)
|
||||||
|
reduced_cfg.add_edge(mapping["try_header"], nop_inst_template, type=ControlFlowEdgeType.NATURAL.value)
|
||||||
|
mapping["try_header"] = nop_inst_template
|
||||||
|
|
||||||
|
try_except_template = TryExceptTemplate311(try_header=mapping["try_header"], try_body=mapping["try_body"], try_footer=mapping.get("try_footer", None), except_body=mapping["except_body"], panic_except=mapping["panic_except"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in reduced_cfg.in_edges(try_except_template.try_header, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
# insert a continuation edge to after the try except
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
if "after_try_except" in mapping.keys():
|
||||||
|
after_try_except = mapping["after_try_except"]
|
||||||
|
out_edges.append((try_except_template, after_try_except, {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_header, try_except_template.try_body, try_except_template.try_footer, try_except_template.except_body, try_except_template.panic_except])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
try_header = self.try_header.to_indented_source(source_lines)
|
||||||
|
try_body = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
# the try footer is optional; in some versions (e.g. 3.7) there may be no try footer at all
|
||||||
|
if self.try_footer:
|
||||||
|
try_footer = ControlFlowTemplate._indent_multiline_string(self.try_footer.to_indented_source(source_lines))
|
||||||
|
else:
|
||||||
|
try_footer = ""
|
||||||
|
|
||||||
|
try_except_lines = [try_header, "try:", try_body, try_footer]
|
||||||
|
|
||||||
|
# if we matched against an "Except ... as" chain, then omit the inserted except: block
|
||||||
|
omit_except = False
|
||||||
|
if isinstance(self.except_body, AbstractExceptionBlockTemplate):
|
||||||
|
omit_except = True
|
||||||
|
elif isinstance(self.except_body, LoopExitTemplate):
|
||||||
|
if isinstance(self.except_body.tail, AbstractExceptionBlockTemplate):
|
||||||
|
omit_except = True
|
||||||
|
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
if not omit_except:
|
||||||
|
try_except_lines.append("except:")
|
||||||
|
except_body = ControlFlowTemplate._indent_multiline_string(except_body)
|
||||||
|
|
||||||
|
try_except_lines.append(except_body)
|
||||||
|
|
||||||
|
# the panic except should never have a line
|
||||||
|
|
||||||
|
return "\n".join(try_except_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
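`TryExceptTemplate311.to_indented_source` builds its output by rendering each child, indenting the bodies, and joining the pieces with newlines. It adds an `except:` line only when the handler is not already an `except ... as` chain. The following is a minimal sketch of that assembly, assuming the children have already been rendered to strings. `indent_block` is a hypothetical stand-in for `_indent_multiline_string`, and `render_try_except` is likewise an illustrative name. Unlike the method above, the sketch skips an empty header instead of emitting a blank line.

```python
import textwrap


def indent_block(text, prefix="    "):
    """Indent every non-blank line of `text` by `prefix`."""
    return textwrap.indent(text, prefix)


def render_try_except(header, body, handler, bare_except=True):
    """Assemble try/except source text from already-rendered child strings."""
    lines = []
    if header:
        lines.append(header)
    lines.append("try:")
    lines.append(indent_block(body))
    if bare_except:
        # The handler is only a body, so supply the `except:` line and indent it.
        lines.append("except:")
        lines.append(indent_block(handler))
    else:
        # The handler already carries its own `except ...:` header lines.
        lines.append(handler)
    return "\n".join(lines)
```

Calling `render_try_except("", "x = 1", "except ValueError:\n    pass", bare_except=False)` leaves the handler text untouched, which corresponds to the `omit_except` branch in the method above.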
+178
@@ -0,0 +1,178 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
from .ExceptTemplate311 import ExceptTemplate311
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import (
|
||||||
|
assert_edge_type,
|
||||||
|
optional_node,
|
||||||
|
optional_edge,
|
||||||
|
assert_in_degree,
|
||||||
|
node_match_all,
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
assert_instruction_opname,
|
||||||
|
is_exactly_opname,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class TryTemplate311(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"try_header": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("NOP"),
|
||||||
|
natural_edge=TemplateEdge(source="try_header", dest="try_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(source="try_body", dest="try_footer", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"try_footer": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_in_degree(1),
|
||||||
|
assert_node_has_no_backwards_edges,
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(source="try_footer", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_footer",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExceptTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_header: ControlFlowTemplate, try_body: ControlFlowTemplate, try_footer: ControlFlowTemplate, except_body: ControlFlowTemplate, panic_except: ControlFlowTemplate):
|
||||||
|
self.try_header = try_header
|
||||||
|
self.try_body = try_body
|
||||||
|
self.try_footer = try_footer
|
||||||
|
self.except_body = except_body
|
||||||
|
self.panic_except = panic_except
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryTemplate311._subgraph, root_key="try_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
|
||||||
|
# "bite off" the NOP from a linear sequence template
|
||||||
|
if isinstance(mapping["try_header"], LinearSequenceTemplate):
|
||||||
|
# grab the nop and update the linear sequence
|
||||||
|
nop_inst_template = mapping["try_header"].members[-1]
|
||||||
|
mapping["try_header"].members = mapping["try_header"].members[:-1]
|
||||||
|
if len(mapping["try_header"].members) == 1:
|
||||||
|
nx.relabel_nodes(reduced_cfg, {mapping["try_header"]: mapping["try_header"].members[0]}, copy=False)
|
||||||
|
mapping["try_header"] = mapping["try_header"].members[0]
|
||||||
|
|
||||||
|
# transfer outgoing edges to the bitten off chunk
|
||||||
|
header_out_edges = list(reduced_cfg.out_edges(mapping["try_header"], data=True))
|
||||||
|
reduced_cfg.add_node(nop_inst_template)
|
||||||
|
reduced_cfg.remove_edges_from(header_out_edges)
|
||||||
|
reduced_cfg.add_edges_from((nop_inst_template, dst, data) for src, dst, data in header_out_edges)
|
||||||
|
reduced_cfg.add_edge(mapping["try_header"], nop_inst_template, type=ControlFlowEdgeType.NATURAL.value)
|
||||||
|
mapping["try_header"] = nop_inst_template
|
||||||
|
|
||||||
|
try_except_template = TryTemplate311(try_header=mapping["try_header"], try_body=mapping["try_body"], try_footer=mapping["try_footer"], except_body=mapping["except_body"], panic_except=mapping["panic_except"])
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in reduced_cfg.in_edges(try_except_template.try_header, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
if "after_try_except" in mapping:
|
||||||
|
after_try_except = mapping["after_try_except"]
|
||||||
|
out_edges.append((try_except_template, after_try_except, {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_header, try_except_template.try_body, try_except_template.try_footer, try_except_template.except_body, try_except_template.panic_except])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
try_header = self.try_header.to_indented_source(source_lines)
|
||||||
|
try_body = self._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
lines = [try_header, "try:", try_body, except_body]
|
||||||
|
|
||||||
|
try_footer = self.try_footer.to_indented_source(source_lines)
|
||||||
|
if try_footer.strip():
|
||||||
|
lines.extend(["else: # inserted", self._indent_multiline_string(try_footer)])
|
||||||
|
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
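The `to_indented_source` method above assembles the decompiled text by joining a header, a `try:` line, the indented body, the already-rendered `except` block, and an optional `else:` branch. A minimal standalone sketch of that assembly follows. The function name `render_try_except`, its parameters, and the four-space indent are assumptions made for illustration, and `textwrap.indent` stands in for the repository's `_indent_multiline_string` helper, whose exact behavior is not shown here. Unlike the method above, this sketch also drops empty pieces so that a missing header does not leave a blank line.

```python
import textwrap


def render_try_except(header: str, try_body: str, except_block: str,
                      else_body: str = "", indent: str = "    ") -> str:
    """Hypothetical helper: join pre-rendered pieces into try/except[/else] source."""
    # The except block is assumed to arrive already rendered, including its
    # own "except ...:" line, so only the try body is indented here.
    lines = [header, "try:", textwrap.indent(try_body, indent), except_block]
    # Emit an else branch only when there is something to put in it.
    if else_body.strip():
        lines.extend(["else:", textwrap.indent(else_body, indent)])
    # Skip empty pieces (for example, an empty header).
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(render_try_except("", "x = 1", "except ValueError:\n    pass", "y = 2"))
```

Keeping the rendering as a pure function over strings makes it easy to check in isolation, independent of how the control flow graph was matched.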
+169
@@ -0,0 +1,169 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
|
||||||
|
from .ExceptTemplate311 import ExceptTemplate311
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import (
|
||||||
|
assert_edge_type,
|
||||||
|
optional_node,
|
||||||
|
optional_edge,
|
||||||
|
assert_in_degree,
|
||||||
|
assert_instruction_opname,
|
||||||
|
is_exactly_opname,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class TryTemplate312(ControlFlowTemplate):
|
||||||
|
_subgraph = {
|
||||||
|
"try_header": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("NOP"),
|
||||||
|
natural_edge=TemplateEdge(source="try_header", dest="try_body", edge_verification_func=assert_edge_type(ControlFlowEdgeType.NATURAL)),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="after_try_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
subtemplate=ExceptTemplate311,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="after_try_except", edge_verification_func=optional_edge, commit_none_to_mapping=False),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="panic_except",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"panic_except": TemplateNode(
|
||||||
|
node_verification_func=is_exactly_opname("COPY", "POP_EXCEPT", "RERAISE"),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="panic_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"after_try_except": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="after_try_except",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, try_header: ControlFlowTemplate, try_body: ControlFlowTemplate, except_body: ControlFlowTemplate, panic_except: ControlFlowTemplate):
|
||||||
|
self.try_header = try_header
|
||||||
|
self.try_body = try_body
|
||||||
|
self.except_body = except_body
|
||||||
|
self.panic_except = panic_except
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=TryTemplate312._subgraph, root_key="try_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
|
||||||
|
# "bite off" the NOP from a linear sequence template
|
||||||
|
if isinstance(mapping["try_header"], LinearSequenceTemplate):
|
||||||
|
# grab the nop and update the linear sequence
|
||||||
|
nop_inst_template = mapping["try_header"].members[-1]
|
||||||
|
mapping["try_header"].members = mapping["try_header"].members[:-1]
|
||||||
|
if len(mapping["try_header"].members) == 1:
|
||||||
|
nx.relabel_nodes(reduced_cfg, {mapping["try_header"]: mapping["try_header"].members[0]}, copy=False)
|
||||||
|
mapping["try_header"] = mapping["try_header"].members[0]
|
||||||
|
|
||||||
|
# transfer outgoing edges to the bitten off chunk
|
||||||
|
header_out_edges = list(reduced_cfg.out_edges(mapping["try_header"], data=True))
|
||||||
|
reduced_cfg.add_node(nop_inst_template)
|
||||||
|
reduced_cfg.remove_edges_from(header_out_edges)
|
||||||
|
reduced_cfg.add_edges_from((nop_inst_template, dst, data) for src, dst, data in header_out_edges)
|
||||||
|
reduced_cfg.add_edge(mapping["try_header"], nop_inst_template, type=ControlFlowEdgeType.NATURAL.value)
|
||||||
|
mapping["try_header"] = nop_inst_template
|
||||||
|
|
||||||
|
try_except_template = TryTemplate312(
|
||||||
|
try_header=mapping["try_header"],
|
||||||
|
try_body=mapping["try_body"],
|
||||||
|
except_body=mapping["except_body"],
|
||||||
|
panic_except=mapping["panic_except"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, try_except_template, edge_properties) for src, dst, edge_properties in reduced_cfg.in_edges(try_except_template.try_header, data=True))
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((try_except_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
if "after_try_except" in mapping:
|
||||||
|
after_try_except = mapping["after_try_except"]
|
||||||
|
out_edges.append((try_except_template, after_try_except, {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg.remove_nodes_from([try_except_template.try_header, try_except_template.try_body, try_except_template.except_body, try_except_template.panic_except])
|
||||||
|
reduced_cfg.add_node(try_except_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
try_header = self.try_header.to_indented_source(source_lines)
|
||||||
|
try_body = self._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
except_body = self.except_body.to_indented_source(source_lines)
|
||||||
|
|
||||||
|
lines = [try_header, "try:", try_body, except_body]
|
||||||
|
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+165
@@ -0,0 +1,165 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
from ...abstract.AbstractExceptionBlockTemplate import AbstractExceptionBlockTemplate
|
||||||
|
from ...natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
from ...try_except.pre_39.TryFinallyPre39 import Pre39TryFinallyTemplate
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, assert_except_as
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class Pre39ExceptAsTemplate(ControlFlowTemplate, AbstractNonSequentiable, AbstractExceptionBlockTemplate):
|
||||||
|
"""
|
||||||
|
An `except as` block, after its cleanup has been structured.
|
||||||
|
If there are multiple, this will match the last block in the series and set up the next one to be matched
|
||||||
|
(0)
|
||||||
|
/ \\j --> (012)
|
||||||
|
(1) (2) |j
|
||||||
|
|j (3)
|
||||||
|
(3)
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"except_as_header": TemplateNode(
|
||||||
|
node_verification_func=assert_except_as,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_as_header",
|
||||||
|
dest="except_setup",
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(source="except_as_header", dest="non_match_path"),
|
||||||
|
exception_edge=TemplateEdge(source="except_as_header", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_setup": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="except_setup",
|
||||||
|
dest="except_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="except_setup", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"except_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(source="except_body", dest="begin_finally", edge_verification_func=optional_edge),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="except_body",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"begin_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="cleanup",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="begin_finally", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"cleanup": TemplateNode(
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="cleanup",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"non_match_path": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="non_match_path",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, except_as_header: ControlFlowTemplate, except_setup: ControlFlowTemplate, except_body: ControlFlowTemplate, begin_finally: ControlFlowTemplate, cleanup: ControlFlowTemplate, non_match_path: ControlFlowTemplate):
|
||||||
|
self.except_as_header = except_as_header
|
||||||
|
self.except_setup = except_setup
|
||||||
|
self.except_body = except_body
|
||||||
|
self.begin_finally = begin_finally
|
||||||
|
self.cleanup = cleanup
|
||||||
|
self.non_match_path = non_match_path
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(template_node_dict=Pre39ExceptAsTemplate._subgraph, root_key="except_as_header", mapping_verification_func=None)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
except_as_cleanup_template = Pre39ExceptAsTemplate(
|
||||||
|
except_as_header=mapping["except_as_header"], except_setup=mapping["except_setup"], except_body=mapping["except_body"], begin_finally=mapping["begin_finally"], cleanup=mapping["cleanup"], non_match_path=mapping["non_match_path"]
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = ((src, except_as_cleanup_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(node, data=True))
|
||||||
|
# only preserve exception handling edges
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((except_as_cleanup_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(
|
||||||
|
[
|
||||||
|
except_as_cleanup_template.except_as_header,
|
||||||
|
except_as_cleanup_template.except_setup,
|
||||||
|
except_as_cleanup_template.except_body,
|
||||||
|
except_as_cleanup_template.begin_finally,
|
||||||
|
except_as_cleanup_template.cleanup,
|
||||||
|
except_as_cleanup_template.non_match_path,
|
||||||
|
]
|
||||||
|
)
|
||||||
|
reduced_cfg.add_node(except_as_cleanup_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
header = self.except_as_header.to_indented_source(source_lines)
|
||||||
|
if isinstance(self.except_body, LinearSequenceTemplate):
|
||||||
|
assert isinstance(self.except_body.members[0], Pre39TryFinallyTemplate)
|
||||||
|
_body = self.except_body.members[0].try_body.to_indented_source(source_lines)
|
||||||
|
else:
|
||||||
|
assert isinstance(self.except_body, Pre39TryFinallyTemplate)
|
||||||
|
_body = self.except_body.try_body.to_indented_source(source_lines)
|
||||||
|
body = ControlFlowTemplate._indent_multiline_string(_body)
|
||||||
|
non_match = self.non_match_path.to_indented_source(source_lines)
|
||||||
|
return f"{header}\n{body}\n{non_match}"
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
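Each `try_to_match_node` above ends the same way: the matched nodes are removed, a single template node is added, and selected edges are reattached to it. The sketch below illustrates that condensation step on a plain dictionary of edges rather than a `networkx` graph, purely so it runs without dependencies. The function `condense` and its parameters are hypothetical. As a simplification it keeps every edge that leaves the matched region, whereas the templates above keep only chosen ones (for example the outer exception handler).

```python
def condense(edges: dict, members, replacement, entry) -> dict:
    """Hypothetical helper: collapse `members` into one `replacement` node.

    `edges` maps (src, dst) -> attribute dict. The input is not modified.
    """
    members = set(members)
    reduced = {}
    for (src, dst), attrs in edges.items():
        if src in members and dst in members:
            continue  # internal edge, swallowed by the condensed node
        if dst in members:
            # Only edges that enter through the entry node are redirected;
            # any other incoming edge is dropped along with its target.
            if dst == entry:
                reduced[(src, replacement)] = dict(attrs)
        elif src in members:
            reduced[(replacement, dst)] = dict(attrs)  # outgoing edge
        else:
            reduced[(src, dst)] = dict(attrs)  # untouched edge
    return reduced


if __name__ == "__main__":
    cfg = {
        ("a", "b"): {"type": "natural"},
        ("b", "c"): {"type": "natural"},
        ("c", "d"): {"type": "natural"},
        ("b", "h"): {"type": "exception"},
    }
    print(condense(cfg, ["b", "c"], "BC", entry="b"))
```

Returning a new mapping instead of mutating the input mirrors the `cfg.copy()` pattern used above, so a failed or speculative match never damages the original graph.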
+211
@@ -0,0 +1,211 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, assert_instruction_opname, node_match_none, node_match_all, contains_opname_sequence
|
||||||
|
|
||||||
|
from ...subtemplates.OptionalExitSubtemplate import ExitSubTemplate
|
||||||
|
|
||||||
|
|
||||||
|
class Pre39TryFinallyExitTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
r"""
|
||||||
|
A `try` block followed only by a `finally`, for Python 3.8 and below. Its structure is similar to the with template.
|
||||||
|
(0) only here because could not figure out a way to condense an exit without killing off the tail
|
||||||
|
|
|
||||||
|
(1)
|
||||||
|
/ e\ --> (0123)
|
||||||
|
(2) \ |
|
||||||
|
\ / (4)
|
||||||
|
(3)
|
||||||
|
Does not cover the additional finally blocks that are inserted into the bytecode as a result of returns or breaking out of loops.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_finally": TemplateNode(
|
||||||
|
node_verification_func=node_match_all(
|
||||||
|
assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
node_match_none(
|
||||||
|
contains_opname_sequence(
|
||||||
|
"POP_TOP",
|
||||||
|
"STORE_FAST",
|
||||||
|
"POP_TOP",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="try_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="setup_finally", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="begin_finally",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="finally",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"begin_finally": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="finally",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"finally": TemplateNode(
|
||||||
|
subtemplate=ExitSubTemplate,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="finally",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
setup_finally: ControlFlowTemplate,
|
||||||
|
try_body: ControlFlowTemplate,
|
||||||
|
begin_finally: ControlFlowTemplate,
|
||||||
|
_finally: ControlFlowTemplate,
|
||||||
|
):
|
||||||
|
self.setup_finally = setup_finally
|
||||||
|
self.try_body = try_body
|
||||||
|
self.begin_finally = begin_finally
|
||||||
|
self._finally = _finally
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cfg.in_degree(node) != 1:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# to avoid being treated as a try-except, we actually need to greedily search up one layer
|
||||||
|
node = next(cfg.predecessors(node))
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=Pre39TryFinallyExitTemplate._subgraph,
|
||||||
|
root_key="setup_finally",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
finally_template = Pre39TryFinallyExitTemplate(
|
||||||
|
setup_finally=mapping["setup_finally"],
|
||||||
|
try_body=mapping["try_body"],
|
||||||
|
begin_finally=mapping["begin_finally"],
|
||||||
|
_finally=mapping["finally"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = [(src, finally_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(finally_template.setup_finally, data=True)]
|
||||||
|
# only preserve exception handling edges
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((finally_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
# if there is a tail add a natural out edge
|
||||||
|
if mapping.get("tail", None):
|
||||||
|
out_edges.append((finally_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from(
|
||||||
|
[
|
||||||
|
finally_template.setup_finally,
|
||||||
|
finally_template.try_body,
|
||||||
|
finally_template.begin_finally,
|
||||||
|
finally_template._finally,
|
||||||
|
]
|
||||||
|
)
|
||||||
|
reduced_cfg.add_node(finally_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
# sometimes the setup finally is included in a linear sequence, so we need to include that source
|
||||||
|
setup_finally = self.setup_finally.to_indented_source(source_lines)
|
||||||
|
try_block = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
# pick one of the finally bodies to get the source code from
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self._finally.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
if not finally_body:
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self.begin_finally.to_indented_source(source_lines))
|
||||||
|
finally_lines = [setup_finally, "try:", try_block, "finally:", finally_body]
|
||||||
|
return "\n".join(finally_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
+188
@@ -0,0 +1,188 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import itertools
|
||||||
|
|
||||||
|
from ...abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
from ...abstract.AbstractNonSequentiableTemplate import AbstractNonSequentiable
|
||||||
|
|
||||||
|
from ....cfg_utils import ControlFlowEdgeType
|
||||||
|
|
||||||
|
from ...Subgraph import TemplateEdge, TemplateNode, GraphTemplateMatcher
|
||||||
|
|
||||||
|
from ...match_utils import optional_node, optional_edge, assert_in_degree, assert_instruction_opname
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
class Pre39TryFinallyTemplate(ControlFlowTemplate, AbstractNonSequentiable):
|
||||||
|
r"""
|
||||||
|
A `try` block followed only by a `finally`, for Python 3.8 and below. Its structure is similar to the with template.
|
||||||
|
(0)
|
||||||
|
|
|
||||||
|
(1)
|
||||||
|
/ e\ --> (0123)
|
||||||
|
(2) \ |
|
||||||
|
\ / (4)
|
||||||
|
(3)
|
||||||
|
Does not cover the additional finally blocks that are inserted into the bytecode as a result of returns or breaking out of loops.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_subgraph = {
|
||||||
|
"setup_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_instruction_opname("SETUP_FINALLY"),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="setup_finally",
|
||||||
|
dest="try_body",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(source="setup_finally", dest="outer_exception_handler", edge_verification_func=optional_edge),
|
||||||
|
),
|
||||||
|
"try_body": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="begin_finally",
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="try_body",
|
||||||
|
dest="finally",
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"begin_finally": TemplateNode(
|
||||||
|
node_verification_func=assert_in_degree(1),
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="finally",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="begin_finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"finally": TemplateNode(
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="finally",
|
||||||
|
dest="tail",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
commit_none_to_mapping=False,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="finally",
|
||||||
|
dest="outer_exception_handler",
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"tail": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="tail",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
"outer_exception_handler": TemplateNode(
|
||||||
|
node_verification_func=optional_node,
|
||||||
|
natural_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
exception_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
conditional_edge=TemplateEdge(
|
||||||
|
source="outer_exception_handler",
|
||||||
|
dest=None,
|
||||||
|
edge_verification_func=optional_edge,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self, setup_finally: ControlFlowTemplate, try_body: ControlFlowTemplate, begin_finally: ControlFlowTemplate, _finally: ControlFlowTemplate):
|
||||||
|
self.setup_finally = setup_finally
|
||||||
|
self.try_body = try_body
|
||||||
|
self.begin_finally = begin_finally
|
||||||
|
self._finally = _finally
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def try_to_match_node(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
"""
|
||||||
|
Attempts to match this template on the graph at the given node.
|
||||||
|
If successful, returns an updated cfg with the appropriate nodes condensed into an instance of this template.
|
||||||
|
Otherwise, returns None.
|
||||||
|
"""
|
||||||
|
if node not in cfg.nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cfg.in_degree(node) != 1:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# to avoid being treated as a try-except, we actually need to greedily search up one layer
|
||||||
|
node = next(cfg.predecessors(node))
|
||||||
|
|
||||||
|
matcher = GraphTemplateMatcher(
|
||||||
|
template_node_dict=Pre39TryFinallyTemplate._subgraph,
|
||||||
|
root_key="setup_finally",
|
||||||
|
mapping_verification_func=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
mapping = matcher.match_at_graph_node(cfg, node)
|
||||||
|
|
||||||
|
if not mapping:
|
||||||
|
return None
|
||||||
|
|
||||||
|
finally_template = Pre39TryFinallyTemplate(
|
||||||
|
setup_finally=mapping["setup_finally"],
|
||||||
|
try_body=mapping["try_body"],
|
||||||
|
begin_finally=mapping["begin_finally"],
|
||||||
|
_finally=mapping["finally"],
|
||||||
|
)
|
||||||
|
|
||||||
|
in_edges = [(src, finally_template, edge_properties) for src, dst, edge_properties in cfg.in_edges(finally_template.setup_finally, data=True)]
|
||||||
|
|
||||||
|
# only preserve exception handling edges
|
||||||
|
|
||||||
|
out_edges = []
|
||||||
|
if mapping["outer_exception_handler"]:
|
||||||
|
out_edges.append((finally_template, mapping["outer_exception_handler"], {"type": ControlFlowEdgeType.EXCEPTION.value}))
|
||||||
|
|
||||||
|
# if there is a tail add it as an out edge
|
||||||
|
if mapping.get("tail", None):
|
||||||
|
out_edges.append((finally_template, mapping["tail"], {"type": ControlFlowEdgeType.NATURAL.value}))
|
||||||
|
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
reduced_cfg.remove_nodes_from([finally_template.setup_finally, finally_template.try_body, finally_template.begin_finally, finally_template._finally])
|
||||||
|
reduced_cfg.add_node(finally_template)
|
||||||
|
reduced_cfg.add_edges_from(itertools.chain(in_edges, out_edges))
|
||||||
|
return reduced_cfg
|
||||||
|
|
||||||
|
def to_indented_source(self, source_lines: list[str]) -> str:
|
||||||
|
"""
|
||||||
|
Returns the source code for this template, recursively calling into its children to create the full source code.
|
||||||
|
"""
|
||||||
|
# sometimes the setup finally is included in a linear sequence, so we need to include that source
|
||||||
|
setup_finally = self.setup_finally.to_indented_source(source_lines)
|
||||||
|
try_block = ControlFlowTemplate._indent_multiline_string(self.try_body.to_indented_source(source_lines))
|
||||||
|
# pick one of the finally bodies to get the source code from
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self._finally.to_indented_source(source_lines))
|
||||||
|
|
||||||
|
if not finally_body:
|
||||||
|
finally_body = ControlFlowTemplate._indent_multiline_string(self._finally.to_indented_source(source_lines))
|
||||||
|
finally_lines = [setup_finally, "try:", try_block, "finally:", finally_body]
|
||||||
|
return "\n".join(finally_lines)
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return super().__repr__()
|
||||||
@@ -0,0 +1,114 @@
|
|||||||
|
from typing import Hashable
|
||||||
|
|
||||||
|
|
||||||
|
def postprocess(source_lines: list[str]):
|
||||||
|
i = 0
|
||||||
|
tab = " " * 4
|
||||||
|
decs = []
|
||||||
|
while i < len(source_lines):
|
||||||
|
line = source_lines[i]
|
||||||
|
if line.startswith("global ") or line.startswith("nonlocal "):
|
||||||
|
decs.append(i)
|
||||||
|
elif line.startswith("__doc__ = "):
|
||||||
|
source_lines[i] = line[10:]
|
||||||
|
# should check for 'from __future__ import ', but lines are still masked now
|
||||||
|
# checking only for 'from ' doesn't make a difference though
|
||||||
|
elif not line.startswith('"""') and not line.startswith("from "):
|
||||||
|
break
|
||||||
|
i += 1
|
||||||
|
for dec in reversed(decs):
|
||||||
|
source_lines.insert(i - 1, source_lines.pop(dec))
|
||||||
|
block_level = None
|
||||||
|
can_have_return = [False]
|
||||||
|
|
||||||
|
while i < len(source_lines):
|
||||||
|
line = source_lines[i]
|
||||||
|
tabs = len(line) - len(line.lstrip("\t"))
|
||||||
|
inserted = line.endswith("# inserted")
|
||||||
|
if inserted:
|
||||||
|
line = line[:-10]
|
||||||
|
line = line.strip()
|
||||||
|
while len(can_have_return) - 1 > tabs:
|
||||||
|
can_have_return.pop()
|
||||||
|
if line.startswith("def ") or line.startswith("class ") or line.startswith("async def "):
|
||||||
|
while len(can_have_return) - 1 < tabs + 1:
|
||||||
|
can_have_return.append(can_have_return[-1])
|
||||||
|
can_have_return[tabs + 1] = not line.startswith("class")
|
||||||
|
# add newline between function and class defs
|
||||||
|
if line.startswith("@") or line.startswith("def ") or line.startswith("class ") or line.startswith("async def "):
|
||||||
|
if i and not source_lines[i - 1].strip().startswith("@") and block_level is None:
|
||||||
|
source_lines.insert(i, "")
|
||||||
|
i += 1
|
||||||
|
if (line.startswith("return ") or line == "return") and not can_have_return[min(tabs, len(can_have_return) - 1)]:
|
||||||
|
source_lines.pop(i)
|
||||||
|
continue
|
||||||
|
# insert pass in empty blocks
|
||||||
|
if block_level is not None:
|
||||||
|
if tabs <= block_level or not line.strip():
|
||||||
|
if source_lines[i - 1].strip().startswith("while True:"):
|
||||||
|
prev_line = source_lines[i - 1]
|
||||||
|
source_lines[i - 1] = " " * (len(prev_line) - len(prev_line.lstrip(" "))) + "pass"
|
||||||
|
else:
|
||||||
|
source_lines.insert(i, tab * (block_level + 1) + "pass # postinserted")
|
||||||
|
i += 1
|
||||||
|
block_level = None
|
||||||
|
if line.endswith(":"):
|
||||||
|
block_level = tabs
|
||||||
|
|
||||||
|
# convert tabs to spaces
|
||||||
|
source_lines[i] = tab * tabs + line + (" # inserted" if inserted else "")
|
||||||
|
i += 1
|
||||||
|
if block_level is not None:
|
||||||
|
source_lines.insert(i, tab * (block_level + 1) + "pass # postinserted")
|
||||||
|
i += 1
|
||||||
|
return "\n".join(source_lines)
|
||||||
|
|
||||||
|
|
||||||
|
def reconstruct_source(pyc, sources):
|
||||||
|
merged_source, blame = merge_indented_sources(pyc, sources)
|
||||||
|
return postprocess(merged_source), blame
|
||||||
|
|
||||||
|
|
||||||
|
def split_newlines(li):
|
||||||
|
return "\n".join(li).split("\n")
|
||||||
|
|
||||||
|
|
||||||
|
def indent_newlines(li, n=1):
|
||||||
|
li = [line for line in split_newlines(li)]
|
||||||
|
return ["\t" * n + line for line in li]
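The two helpers above can be exercised on their own. The following is a minimal sketch that restates them with type hints; the sample input `nested` is hypothetical and only illustrates how embedded newlines are flattened before the tab prefix is applied.

```python
def split_newlines(li: list[str]) -> list[str]:
    # Join the entries, then split again, so that any embedded "\n"
    # inside an entry becomes its own list item.
    return "\n".join(li).split("\n")


def indent_newlines(li: list[str], n: int = 1) -> list[str]:
    # Prefix every physical line with n tab characters.
    return ["\t" * n + line for line in split_newlines(li)]


# Hypothetical input: one entry spans two physical lines.
nested = ["def inner():\n\treturn 1", "x = inner()"]
flat = split_newlines(nested)
indented = indent_newlines(nested, 2)
```

Because indentation is tracked with tab characters at this stage, `postprocess` is the step that later converts each tab to four spaces.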
|
||||||
|
|
||||||
|
|
||||||
|
def merge_indented_sources(pyc, sources):
|
||||||
|
blame_dict = {}
|
||||||
|
for bytecode in pyc.child_bytecodes:
|
||||||
|
sources[bytecode.codeobj], blame_dict[bytecode.codeobj] = merge_indented_sources(bytecode, sources)
|
||||||
|
line = 0
|
||||||
|
indented_source = split_newlines(sources[pyc.codeobj])
|
||||||
|
blame = [pyc.codeobj] * len(indented_source)
|
||||||
|
lines_set = set()
|
||||||
|
for i, instruction in enumerate(pyc.ordered_instructions):
|
||||||
|
if instruction.starts_line and instruction.starts_line not in lines_set:
|
||||||
|
lines_set.add(instruction.starts_line)
|
||||||
|
# count implicit else/finally/while True
|
||||||
|
while line < len(indented_source) and indented_source[line].endswith("# inserted"):
|
||||||
|
line += 1
|
||||||
|
line += 1
|
||||||
|
if instruction.opname == "LOAD_CONST" and isinstance(instruction.argval, Hashable):
|
||||||
|
if instruction.argval in sources:
|
||||||
|
if instruction.argval.co_name not in ("<listcomp>", "<genexpr>", "<setcomp>", "<dictcomp>", "<lambda>"):
|
||||||
|
new_tabs = 1
|
||||||
|
|
||||||
|
# add indentation of previous line
|
||||||
|
prev_line = ""
|
||||||
|
if line > 0:
|
||||||
|
if line < len(indented_source):
|
||||||
|
prev_line = indented_source[line - 1]
|
||||||
|
else:
|
||||||
|
prev_line = indented_source[-1]
|
||||||
|
new_tabs += len(prev_line) - len(prev_line.lstrip("\t"))
|
||||||
|
|
||||||
|
code_to_insert = indent_newlines(sources[instruction.argval], new_tabs)
|
||||||
|
indented_source[line:line] = code_to_insert
|
||||||
|
blame[line:line] = blame_dict[instruction.argval]
|
||||||
|
line += len(code_to_insert)
|
||||||
|
return indented_source, blame
|
||||||
@@ -0,0 +1,452 @@
|
|||||||
|
import networkx as nx
|
||||||
|
|
||||||
|
import os
|
||||||
|
|
||||||
|
from pylingual.utils.lazy import lazy_import
|
||||||
|
|
||||||
|
from .cfg_utils import ControlFlowEdgeType, get_out_edge_dict, get_dominator_function
|
||||||
|
from pylingual.editable_bytecode import Inst
|
||||||
|
from pylingual.editable_bytecode import EditableBytecode
|
||||||
|
|
||||||
|
# abstract type
|
||||||
|
from .control_flow_templates.abstract.AbstractTemplate import ControlFlowTemplate
|
||||||
|
|
||||||
|
from .control_flow_templates.placeholders.IrreduciblePlaceholderTemplate import IrreduciblePlaceholderTemplate
|
||||||
|
|
||||||
|
# default flow
|
||||||
|
from .control_flow_templates.natural.InstructionTemplate import InstructionTemplate
|
||||||
|
from .control_flow_templates.natural.LinearSequenceTemplate import LinearSequenceTemplate
|
||||||
|
from .control_flow_templates.natural.LineTemplate import LineTemplate
|
||||||
|
|
||||||
|
# if/else
|
||||||
|
from .control_flow_templates.if_then.IfThenTemplate import IfThenTemplate
|
||||||
|
from .control_flow_templates.if_then.IfElseTemplate import IfElseTemplate
|
||||||
|
from .control_flow_templates.if_then.IfThenJumpTemplate import IfThenJumpTemplate
|
||||||
|
from .control_flow_templates.if_then.ConditionalExitTemplate import ConditionalExitTemplate
|
||||||
|
from .control_flow_templates.booleans.ShortCircuitOrTemplate import ShortCircuitOrTemplate
|
||||||
|
from .control_flow_templates.booleans.ShortCircuitOrContinueTemplate import ShortCircuitOrContinueTemplate
|
||||||
|
from .control_flow_templates.booleans.ShortCircuitAndTemplate import ShortCircuitAndTemplate
|
||||||
|
from .control_flow_templates.booleans.ChainedComparisonTemplate import ChainedComparisonTemplate
|
||||||
|
from .control_flow_templates.context_managers.WithTemplate import WithTemplate
|
||||||
|
from .control_flow_templates.context_managers.WithTemplate39 import WithTemplate39
|
||||||
|
from .control_flow_templates.context_managers.WithCleanup312 import WithCleanup312
|
||||||
|
from .control_flow_templates.context_managers.AsyncWithCleanup312 import AsyncWithCleanup312
|
||||||
|
from .control_flow_templates.context_managers.WithTemplate312 import WithTemplate312
|
||||||
|
from .control_flow_templates.context_managers.Await312Template import Await312Template
|
||||||
|
|
||||||
|
# loops
|
||||||
|
from .control_flow_templates.loop.LoopTemplate import LoopTemplate
|
||||||
|
from .control_flow_templates.loop.SelfLoopTemplate import SelfLoopTemplate
|
||||||
|
from .control_flow_templates.loop.LoopExitTemplate import LoopExitTemplate
|
||||||
|
from .control_flow_templates.loop.PreRefinedLoopTemplate import PreRefinedLoopTemplate
|
||||||
|
from .control_flow_templates.loop.RefinedLoopTemplate import RefinedLoopTemplate
|
||||||
|
from .control_flow_templates.loop.WhileTrueIfElseTemplate import WhileTrueIfElseTemplate
|
||||||
|
from .control_flow_templates.loop.AsyncForTemplate import AsyncForTemplate
|
||||||
|
from .control_flow_templates.loop.InlinedComprehension import InlinedComprehensionTemplate
|
||||||
|
from .control_flow_templates.loop.ForIf312Template import ForIf312Template
|
||||||
|
|
||||||
|
# exceptions
|
||||||
|
from .control_flow_templates.try_except.TryExceptTemplate import TryExceptTemplate
|
||||||
|
from .control_flow_templates.try_except.TryExceptElseTemplate import TryExceptElseTemplate
|
||||||
|
from .control_flow_templates.try_except.ExceptAsExceptTemplate import ExceptAsExceptTemplate
|
||||||
|
from .control_flow_templates.try_except.ExceptAsCleanup import ExceptAsCleanupTemplate
|
||||||
|
from .control_flow_templates.try_except.ExceptAsExitTemplate import ExceptAsExitTemplate
|
||||||
|
from .control_flow_templates.try_except.FinallyTemplate import FinallyTemplate
|
||||||
|
from .control_flow_templates.try_except.TryFinallyTemplate import TryFinallyTemplate
|
||||||
|
from .control_flow_templates.try_except.pre_39.TryFinallyPre39 import Pre39TryFinallyTemplate
|
||||||
|
from .control_flow_templates.try_except.pre_39.TryFinallyExitPre39 import Pre39TryFinallyExitTemplate
|
||||||
|
from .control_flow_templates.try_except.pre_39.ExceptAsPre39 import Pre39ExceptAsTemplate
|
||||||
|
from .control_flow_templates.try_except.ExceptException import ExceptException
|
||||||
|
from .control_flow_templates.try_except.GeneratorCleanupTemplate import GeneratorCleanupTemplate
|
||||||
|
|
||||||
|
# 3.11/3.12-specific exceptions
|
||||||
|
from .control_flow_templates.try_except.post_311.TryTemplate311 import TryTemplate311
|
||||||
|
from .control_flow_templates.try_except.post_311.TryTemplate312 import TryTemplate312
|
||||||
|
from .control_flow_templates.try_except.post_311.FinallyTemplate312 import FinallyTemplate312
|
||||||
|
|
||||||
|
import pathlib
|
||||||
|
|
||||||
|
lazy_import("pydot")
|
||||||
|
|
||||||
|
from typing import Generator, Any
|
||||||
|
|
||||||
|
|
||||||
|
def viz(graph, name, node_label="label"):
|
||||||
|
namepath = pathlib.Path(name)
|
||||||
|
dot = pydot.Dot(namepath.name)
|
||||||
|
nodes = {}
|
||||||
|
|
||||||
|
for node, data in graph.nodes.data():
|
||||||
|
n = pydot.Node(hash(node), label=data[node_label])
|
||||||
|
dot.add_node(n)
|
||||||
|
nodes[hash(node)] = n
|
||||||
|
|
||||||
|
for node1, node2, data in graph.edges.data():
|
||||||
|
edge = pydot.Edge(nodes[hash(node1)], nodes[hash(node2)], **data)
|
||||||
|
dot.add_edge(edge)
|
||||||
|
|
||||||
|
try:
|
||||||
|
dot.write_png(name)
|
||||||
|
except FileNotFoundError:
|
||||||
|
dot.write_raw(name.replace(".png", ".dot"))
|
||||||
|
|
||||||
|
|
||||||
|
# order matters!
|
||||||
|
# More specific templates should appear before more general templates for correctness
|
||||||
|
# More common templates should appear before more rare templates for efficiency
|
||||||
|
cyclic_templates: list[type[ControlFlowTemplate]] = [
|
||||||
|
WhileTrueIfElseTemplate,
|
||||||
|
LoopTemplate,
|
||||||
|
SelfLoopTemplate,
|
||||||
|
ShortCircuitOrTemplate, # the short circuit templates aren't cyclic, but are needed to match certain while loops
|
||||||
|
ShortCircuitAndTemplate,
|
||||||
|
]
|
||||||
|
|
||||||
|
# priority dict structure
|
||||||
|
# Template type : (pass number, priority number) # lower is earlier
|
||||||
|
acyclic_templates_priority_dict: dict[type[ControlFlowTemplate], tuple[int, int]] = {
|
||||||
|
RefinedLoopTemplate: (0, 0),
|
||||||
|
AsyncForTemplate: (0, 1), # technically a cyclic template, but it searches up one node to complete the loop
|
||||||
|
FinallyTemplate: (0, 10),
|
||||||
|
WithTemplate: (0, 11),
|
||||||
|
LinearSequenceTemplate: (0, 20),
|
||||||
|
ExceptAsExitTemplate: (0, 25),
|
||||||
|
ExceptAsExceptTemplate: (0, 27),
|
||||||
|
ShortCircuitOrContinueTemplate: (0, 30),
|
||||||
|
IfElseTemplate: (1, 43),
|
||||||
|
IfThenTemplate: (1, 44),
|
||||||
|
IfThenJumpTemplate: (0, 45),
|
||||||
|
ConditionalExitTemplate: (0, 46),
|
||||||
|
ExceptAsCleanupTemplate: (0, 50),
|
||||||
|
TryFinallyTemplate: (0, 60),
|
||||||
|
TryExceptElseTemplate: (0, 62),
|
||||||
|
ShortCircuitOrTemplate: (0, 70),
|
||||||
|
ShortCircuitAndTemplate: (0, 71),
|
||||||
|
ChainedComparisonTemplate: (0, 72),
|
||||||
|
}
|
||||||
|
|
||||||
|
# dictionary structure
|
||||||
|
# version: {template: (pass, priority)}
|
||||||
|
version_specific_acyclic_templates_dict: dict[tuple[int, int], dict[type[ControlFlowTemplate], tuple[int, int]]] = {
|
||||||
|
(3, 13): {
|
||||||
|
TryTemplate312: (-1, 60),
|
||||||
|
TryTemplate311: (-1, 61),
|
||||||
|
WithCleanup312: (-1, 0),
|
||||||
|
AsyncWithCleanup312: (-1, 0),
|
||||||
|
WithTemplate312: (0, 10),
|
||||||
|
InlinedComprehensionTemplate: (-1, 0),
|
||||||
|
GeneratorCleanupTemplate: (0, 1),
|
||||||
|
Await312Template: (0, 2),
|
||||||
|
ForIf312Template: (0, 0),
|
||||||
|
FinallyTemplate312: (-1, 199),
|
||||||
|
},
|
||||||
|
(3, 12): {
|
||||||
|
TryTemplate312: (-1, 60),
|
||||||
|
TryTemplate311: (-1, 61),
|
||||||
|
WithCleanup312: (-1, 0),
|
||||||
|
AsyncWithCleanup312: (-1, 0),
|
||||||
|
WithTemplate312: (0, 10),
|
||||||
|
InlinedComprehensionTemplate: (-1, 0),
|
||||||
|
GeneratorCleanupTemplate: (0, 1),
|
||||||
|
Await312Template: (0, 2),
|
||||||
|
ForIf312Template: (0, 0),
|
||||||
|
FinallyTemplate312: (-1, 199),
|
||||||
|
},
|
||||||
|
(3, 11): {
|
||||||
|
TryTemplate311: (-1, 55),
|
||||||
|
},
|
||||||
|
(3, 9): {
|
||||||
|
WithTemplate39: (0, 12),
|
||||||
|
TryExceptTemplate: (1, 61),
|
||||||
|
},
|
||||||
|
(3, 8): {
|
||||||
|
Pre39ExceptAsTemplate: (0, 40),
|
||||||
|
Pre39TryFinallyTemplate: (0, 60),
|
||||||
|
Pre39TryFinallyExitTemplate: (0, 75),
|
||||||
|
ExceptException: (0, 38),
|
||||||
|
TryExceptTemplate: (1, 61),
|
||||||
|
},
|
||||||
|
(3, 7): {
|
||||||
|
Pre39ExceptAsTemplate: (0, 40),
|
||||||
|
Pre39TryFinallyTemplate: (0, 60),
|
||||||
|
Pre39TryFinallyExitTemplate: (0, 75),
|
||||||
|
ExceptException: (0, 38),
|
||||||
|
TryExceptTemplate: (1, 61),
|
||||||
|
},
|
||||||
|
(3, 6): {
|
||||||
|
Pre39ExceptAsTemplate: (0, 40),
|
||||||
|
Pre39TryFinallyTemplate: (0, 60),
|
||||||
|
Pre39TryFinallyExitTemplate: (0, 75),
|
||||||
|
ExceptException: (0, 38),
|
||||||
|
TryExceptTemplate: (1, 61),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def get_acyclic_template_passes(version: tuple[int, int]) -> Generator[list[type[ControlFlowTemplate]], None, None]:
|
||||||
|
pass_dict = dict()
|
||||||
|
# accumulate the passes, merging in version-specific templates
|
||||||
|
for template, (pass_number, priority) in (acyclic_templates_priority_dict | version_specific_acyclic_templates_dict.get(version, dict())).items():
|
||||||
|
pass_list = pass_dict.get(pass_number, list())
|
||||||
|
pass_list.append((template, priority))
|
||||||
|
pass_dict[pass_number] = pass_list
|
||||||
|
# sort each pass by priority
|
||||||
|
for pass_number, pass_list in pass_dict.items():
|
||||||
|
pass_dict[pass_number] = [template for template, priority in sorted(pass_list, key=lambda item: item[1])]
|
||||||
|
# yield the templates for each pass
|
||||||
|
for pass_number in sorted(pass_dict.keys()):
|
||||||
|
yield pass_dict[pass_number]
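The pass/priority merge performed by `get_acyclic_template_passes` can be sketched in isolation. This is an illustrative sketch only: the names `base`, `version_specific`, and `template_passes` are hypothetical, and plain strings stand in for the template classes.

```python
from collections import defaultdict
from typing import Iterator

# Hypothetical stand-ins: strings replace the template classes.
# Values are (pass number, priority); lower runs earlier.
base = {
    "LinearSequence": (0, 20),
    "IfElse": (1, 43),
    "IfThen": (1, 44),
    "TryFinally": (0, 60),
}
version_specific = {(3, 11): {"TryTemplate311": (-1, 55)}}


def template_passes(version: tuple[int, int]) -> Iterator[list[str]]:
    passes: dict[int, list[tuple[str, int]]] = defaultdict(list)
    # Version-specific entries are merged over the base table.
    merged = base | version_specific.get(version, {})
    for template, (pass_number, priority) in merged.items():
        passes[pass_number].append((template, priority))
    # Yield passes in ascending order, each sorted by priority.
    for pass_number in sorted(passes):
        yield [t for t, _ in sorted(passes[pass_number], key=lambda item: item[1])]


passes_311 = list(template_passes((3, 11)))
passes_310 = list(template_passes((3, 10)))
```

A negative pass number, as used by the 3.11 and later entries, simply sorts ahead of pass 0, so those templates get the first chance to match.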
|
||||||
|
|
||||||
|
|
||||||
|
def visualize(graph: nx.DiGraph, name, suffix):
|
||||||
|
# visualization is slow
|
||||||
|
if os.environ.get("DEBUG_CFLOW", None) != "1":
|
||||||
|
return
|
||||||
|
for n in graph.nodes:
|
||||||
|
graph.nodes[n]["label"] = repr(n)
|
||||||
|
v = next(x for x in graph.nodes if not isinstance(x, str)).get_instructions()[0].bytecode.version
|
||||||
|
viz(graph, f"/tmp/graph/{name}_{v[1]}_{suffix}.png")
|
||||||
|
|
||||||
|
|
||||||
|
def structure_loop(cfg: nx.DiGraph, node) -> nx.DiGraph:
|
||||||
|
dominates = get_dominator_function(cfg)
|
||||||
|
# a node is a loop header if there are back-edges to it
|
||||||
|
# a latching node is a node with a back-edge to the loop header
|
||||||
|
# a back-edge is an edge from any node that is dominated by this node
|
||||||
|
latching_nodes = [pred for pred in cfg.predecessors(node) if dominates(node, pred)]
|
||||||
|
if not latching_nodes:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# attempt to match a loop template
|
||||||
|
for template in cyclic_templates:
|
||||||
|
candidate_cfg = template.try_to_match_node(cfg, node)
|
||||||
|
if candidate_cfg is not None:
|
||||||
|
return candidate_cfg
|
||||||
|
|
||||||
|
if len(node.get_instructions()) == 1 and node.get_instructions()[0].opname == "SEND":
|
||||||
|
return None
|
||||||
|
|
||||||
|
# identify the canonical loop exit and outer exception handler by looking at the loop header
|
||||||
|
loop_header_edge_dict = get_out_edge_dict(cfg, node)
|
||||||
|
canonical_loop_exit, _ = loop_header_edge_dict["conditional"]
|
||||||
|
outer_exception_handler, _ = loop_header_edge_dict["exception"]
|
||||||
|
|
||||||
|
# subgraph containing all nodes dominated by the loop header
|
||||||
|
dominated_subgraph: nx.DiGraph = cfg.subgraph(n for n in cfg.nodes if dominates(node, n))
|
||||||
|
reverse_reachability_map = nx.single_source_shortest_path_length(dominated_subgraph.reverse(), source=node)
|
||||||
|
# a node is in the loop if there is a backwards path to the header that doesn't leave the loop
|
||||||
|
loop_nodes = [loop_node for loop_node, distance in reverse_reachability_map.items() if distance >= 0]
|
||||||
|
# extend loop nodes with their natural edges; you can't leave the loop without a jump of some kind
|
||||||
|
# also extend loop nodes with exception edges that do not leave the loop
|
||||||
|
natural_edges = [(u, v) for u, v, data in dominated_subgraph.edges(data=True) if data["type"] == ControlFlowEdgeType.NATURAL.value]
|
||||||
|
# also extend loop nodes with their conditional edges, excluding the loop header
|
||||||
|
conditional_edges = [(u, v) for u, v, data in dominated_subgraph.edges(data=True) if data["type"] in [ControlFlowEdgeType.TRUE_JUMP.value, ControlFlowEdgeType.FALSE_JUMP.value] and u != node]
|
||||||
|
internal_exception_edges = [(u, v) for u, v, data in dominated_subgraph.edges(data=True) if data["type"] == ControlFlowEdgeType.EXCEPTION.value and v is not outer_exception_handler]
|
||||||
|
natural_dominated_subgraph = dominated_subgraph.edge_subgraph(natural_edges + internal_exception_edges + conditional_edges)
|
||||||
|
loop_nodes = set(loop_nodes + [v for _, v in nx.edge_dfs(natural_dominated_subgraph, source=loop_nodes)])
|
||||||
|
|
||||||
|
# the canonical loop exit can be misidentified in `while True` loops that start with an if statement
|
||||||
|
if canonical_loop_exit and any(exit_successor in loop_nodes for exit_successor in cfg.successors(canonical_loop_exit)):
|
||||||
|
canonical_loop_exit = None
|
||||||
|
|
||||||
|
# There are 4 kinds of exits:
|
||||||
|
# 1. canonical exit (the conditional branch from the loop header)
|
||||||
|
# 2. break statement
|
||||||
|
# 3. return statement
|
||||||
|
# 4. raised exception caught outside loop
|
||||||
|
loop_exit_edges = [(src, dst) for src, dst in cfg.edges if src in loop_nodes and dst not in loop_nodes and cfg.get_edge_data(src, dst)["type"] != ControlFlowEdgeType.META.value]
|
||||||
|
|
||||||
|
loop_successor = None
|
||||||
|
break_edges = []
|
||||||
|
for loop_node, exit_node in loop_exit_edges:
|
||||||
|
# skip the canonical exit
|
||||||
|
if loop_node is node and exit_node is canonical_loop_exit:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# skip exception edges to the outer handler
|
||||||
|
if cfg.get_edge_data(loop_node, exit_node)["type"] == ControlFlowEdgeType.EXCEPTION.value and exit_node is outer_exception_handler:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# all other cases are exhausted, so we are now only considering break statements
|
||||||
|
if loop_successor is None:
|
||||||
|
loop_successor = exit_node
|
||||||
|
elif loop_successor != exit_node:
|
||||||
|
if os.environ.get("DEBUG_CFLOW", None) == "1":
|
||||||
|
breakpoint()
|
||||||
|
raise RuntimeError("Found multiple break targets in the same loop!")
|
||||||
|
|
||||||
|
break_edges.append((loop_node, exit_node))
|
||||||
|
|
||||||
|
# if there are no break statements, then the successor is the canonical exit
|
||||||
|
# the canonical exit may be different in the case of a loop-else, but that only matters if there are breaks
|
||||||
|
if loop_successor is None:
|
||||||
|
loop_successor = canonical_loop_exit
|
||||||
|
|
||||||
|
# continue edges are all the latching nodes; may be explicit or implicit
|
||||||
|
continue_edges = [(src, node) for src in latching_nodes]
|
||||||
|
|
||||||
|
# if we found nothing to refine, then exit
|
||||||
|
if not continue_edges and not break_edges:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# reduce the break/continue edges
|
||||||
|
reduced_cfg = cfg.copy()
|
||||||
|
for continue_edge in set(continue_edges):
|
||||||
|
LoopExitTemplate.structure_edge_inplace(reduced_cfg, continue_edge, exit_statment="continue")
|
||||||
|
|
||||||
|
for break_edge in set(break_edges):
|
||||||
|
LoopExitTemplate.structure_edge_inplace(reduced_cfg, break_edge, exit_statment="break")
|
||||||
|
|
||||||
|
# partially structure the loop while we have the information available
|
||||||
|
# if the canonical exit is not the successor, then the canonical exit is a loop else
|
||||||
|
if canonical_loop_exit is not None and loop_successor is not None and canonical_loop_exit != loop_successor:
|
||||||
|
loop_else_out_edges = get_out_edge_dict(reduced_cfg, canonical_loop_exit)
|
||||||
|
if loop_else_out_edges["natural"] is not None and loop_else_out_edges["natural"][0] != loop_successor:
|
||||||
|
# TODO: fix triple-nested loops with an else/break
|
||||||
|
e = (canonical_loop_exit, loop_else_out_edges["natural"][0])
|
||||||
|
if dominates(e[1], e[0]):
|
||||||
|
# backwards edge
|
||||||
|
canonical_loop_exit = LoopExitTemplate.structure_edge_inplace(reduced_cfg, e, exit_statment="continue")
|
||||||
|
else:
|
||||||
|
canonical_loop_exit = LoopExitTemplate.structure_edge_inplace(reduced_cfg, e, exit_statment="break")
|
||||||
|
PreRefinedLoopTemplate.structure_nodes_inplace(reduced_cfg, loop_header=node, canonical_loop_exit=canonical_loop_exit, loop_successor=loop_successor)
|
||||||
|
|
||||||
|
return reduced_cfg
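The back-edge test at the top of `structure_loop` (a predecessor of the header that the header dominates is a latching node) can be sketched without networkx. This is a simplified illustration, not the project's implementation: `dominators` and `latching_nodes` are hypothetical helpers, the graph is a plain adjacency dict, and every node is assumed to appear as a key and to be reachable from the entry.

```python
def dominators(succ: dict[str, list[str]], entry: str) -> dict[str, set[str]]:
    # Simple iterative data-flow computation of dominator sets.
    # Assumes every node is a key of succ and is reachable from entry.
    nodes = set(succ)
    preds: dict[str, list[str]] = {n: [] for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            preds[v].append(u)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def latching_nodes(succ: dict[str, list[str]], entry: str, header: str) -> list[str]:
    # A latching node has an edge to the header and is dominated by it.
    dom = dominators(succ, entry)
    return sorted(u for u, vs in succ.items() if header in vs and header in dom[u])


# Hypothetical CFG of a single while loop.
cfg_edges = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}
loop_latches = latching_nodes(cfg_edges, "entry", "header")
```

Here `entry` also has an edge to `header`, but it is not dominated by `header`, so only `body` qualifies as a latching node.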
|
||||||
|
|
||||||
|
|
||||||
|
def get_line_out_edge_dict(cfg: nx.DiGraph, insts: list[Inst]) -> dict[str, tuple[Any, ControlFlowEdgeType]]:
|
||||||
|
# check that all outgoing edges of a given category have the same target
|
||||||
|
line_out_edge_dict = dict()
|
||||||
|
for inst in insts:
|
||||||
|
for edge_category, (edge_target, edge_data) in get_out_edge_dict(cfg, inst).items():
|
||||||
|
# skip considering internal control flow
|
||||||
|
if edge_target is None or edge_target in insts:
|
||||||
|
continue
|
||||||
|
# add edge to line-level mapping if this is the first time we've seen it
|
||||||
|
if edge_category not in line_out_edge_dict:
|
||||||
|
line_out_edge_dict[edge_category] = (edge_target, edge_data["type"])
|
||||||
|
# reject inconsistent mappings; this line cannot be condensed
|
||||||
|
elif edge_target != line_out_edge_dict[edge_category][0]:
|
||||||
|
return None
|
||||||
|
return line_out_edge_dict
|
||||||
|
|
||||||
|
|
||||||
|
def condense_lines(cfg: nx.DiGraph, bytecode: EditableBytecode) -> None:
|
||||||
|
lno_insts = bytecode.get_lno_insts()
|
||||||
|
for line_number, insts in lno_insts.items():
|
||||||
|
insts = [inst for inst in insts if inst in cfg.nodes] # discard unreachable instructions
|
||||||
|
if not insts:
|
||||||
|
continue
|
||||||
|
line_in_edges = cfg.in_edges(nbunch=insts, data=True)
|
||||||
|
# check that no edges come from the outside to the middle of the line (sanity check)
|
||||||
|
incoming_edges = [(src, dst, data) for src, dst, data in line_in_edges if src not in insts]
|
||||||
|
if any(dst != insts[0] for src, dst, data in incoming_edges):
|
||||||
|
continue
|
||||||
|
|
||||||
|
line_out_edge_dict = get_line_out_edge_dict(cfg, insts)
|
||||||
|
if line_out_edge_dict is None:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# group up all the instructions in the line into a LineTemplate
|
||||||
|
line_template = LineTemplate(*[InstructionTemplate(inst) for inst in insts])
|
||||||
|
cfg.remove_nodes_from(insts)
|
||||||
|
cfg.add_node(line_template)
|
||||||
|
cfg.add_edges_from((src, line_template, data) for src, dst, data in incoming_edges)
|
||||||
|
for edge_category, (target, edge_type) in line_out_edge_dict.items():
|
||||||
|
cfg.add_edge(line_template, target, type=edge_type)
|
||||||
|
|
||||||
|
|
||||||
|
def condense_basic_blocks(cfg: nx.DiGraph) -> nx.DiGraph:
|
||||||
|
structured_cfg = cfg.copy()
|
||||||
|
for node in list(structured_cfg.nodes):
|
||||||
|
if node == "START":
|
||||||
|
continue
|
||||||
|
candidate_cfg = LinearSequenceTemplate.try_to_match_node(structured_cfg, node)
|
||||||
|
if candidate_cfg is not None:
|
||||||
|
structured_cfg = candidate_cfg
|
||||||
|
return structured_cfg
|
||||||
|
|
||||||
|
|
||||||
|
def structure_control_flow(cfg: nx.DiGraph, bytecode: EditableBytecode) -> ControlFlowTemplate:
|
||||||
|
# group lines with no weird control flow into LineTemplates
|
||||||
|
# currently reduces overall performance on 3.9
|
||||||
|
# condense_lines(cfg, bytecode)
|
||||||
|
|
||||||
|
# 1. wrap instructions globally
|
||||||
|
structured_cfg = InstructionTemplate.match_graph(cfg)
|
||||||
|
root_node = min([inst_template for inst_template in structured_cfg.nodes], key=lambda inst_template: inst_template.get_instructions()[0].offset)
|
||||||
|
structured_cfg.add_nodes_from(["START", "END"])
|
||||||
|
structured_cfg.add_edge("START", root_node, type="meta")
|
||||||
|
structured_cfg.add_edges_from((inst_template, "END", {"type": "meta"}) for inst_template in structured_cfg.nodes if isinstance(inst_template, InstructionTemplate) and inst_template.instruction.opname in ["RETURN_VALUE", "RETURN_CONST"])
|
||||||
|
|
||||||
|
modification_counter = 0
|
||||||
|
# 2. match linear sequences globally
|
||||||
|
structured_cfg = condense_basic_blocks(structured_cfg)
|
||||||
|
|
||||||
|
# 3. repeat until the graph has no non-meta edges
|
||||||
|
# 3a. Check for matches on loop templates
|
||||||
|
# 3b. Check for matches on non-loop templates
|
||||||
|
# 3c. Check for matches on exception templates
|
||||||
|
visualize(structured_cfg, bytecode.name, modification_counter)
|
||||||
|
|
||||||
|
def fully_structured(cfg: nx.DiGraph) -> bool:
|
||||||
|
# if there are any non-meta edges, the control flow is not fully structured
|
||||||
|
if any(edge_type != ControlFlowEdgeType.META.value for _, _, edge_type in cfg.edges(data="type")):
|
||||||
|
return False
|
||||||
|
# if there is more than one node other than START and END, the control flow is not fully structured
|
||||||
|
if len(cfg) > 3:
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
infinite_loop_detection_threshold = 50
|
||||||
|
|
||||||
|
while not fully_structured(structured_cfg):
|
||||||
|
modified = False
|
||||||
|
for acyclic_templates in get_acyclic_template_passes(version=bytecode.version.as_tuple()):
|
||||||
|
current_num_nodes = len(structured_cfg.nodes)
|
||||||
|
for node in nx.dfs_postorder_nodes(structured_cfg, source="START"):
|
||||||
|
# don't process the START and END meta nodes
|
||||||
|
if node in ["START", "END"]:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if new_cfg := structure_loop(structured_cfg, node):
|
||||||
|
structured_cfg = new_cfg
|
||||||
|
modified = True
|
||||||
|
modification_counter += 1
|
||||||
|
visualize(structured_cfg, bytecode.name, modification_counter)
|
||||||
|
break
|
||||||
|
|
||||||
|
# check acyclic patterns if no cyclic pattern was matched
|
||||||
|
for template in acyclic_templates:
|
||||||
|
candidate_cfg = template.try_to_match_node(structured_cfg, node)
|
||||||
|
if candidate_cfg is not None:
|
||||||
|
structured_cfg = candidate_cfg
|
||||||
|
modified = True
|
||||||
|
modification_counter += 1
|
||||||
|
visualize(structured_cfg, bytecode.name, modification_counter)
|
||||||
|
break
|
||||||
|
|
||||||
|
if modified:
|
||||||
|
break
|
||||||
|
|
||||||
|
if modified:
|
||||||
|
break
|
||||||
|
|
||||||
|
if not modified:
|
||||||
|
# in debug mode, break into the debugger to inspect the irreducible cfg
|
||||||
|
if os.environ.get("DEBUG_CFLOW", None) == "1":
|
||||||
|
breakpoint()
|
||||||
|
return IrreduciblePlaceholderTemplate("irreducible")
|
||||||
|
else:
|
||||||
|
new_num_nodes = len(structured_cfg)
|
||||||
|
if new_num_nodes >= current_num_nodes:
|
||||||
|
infinite_loop_detection_threshold -= 1
|
||||||
|
else:
|
||||||
|
infinite_loop_detection_threshold = 50
|
||||||
|
|
||||||
|
if infinite_loop_detection_threshold <= 0:
|
||||||
|
return IrreduciblePlaceholderTemplate("infinite grammar loop")
|
||||||
|
|
||||||
|
structured_cfg.remove_nodes_from(["START", "END"])
|
||||||
|
|
||||||
|
return list(structured_cfg.nodes)[0]
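The outer loop of `structure_control_flow` follows a common pattern: apply the first rewrite that matches, give up if nothing matches, and abort if many consecutive rewrites fail to shrink the graph. The sketch below shows only that pattern under simplifying assumptions; `reduce_until_stable`, `merge_first_two`, and the list-of-strings "graph" are hypothetical and do not correspond to names in the project.

```python
from typing import Callable, Optional

Rule = Callable[[list[str]], Optional[list[str]]]


def reduce_until_stable(items: list[str], rules: list[Rule], stall_budget: int = 50) -> str:
    budget = stall_budget
    while len(items) > 1:
        for rule in rules:
            result = rule(items)
            if result is not None:
                break
        else:
            # No rule matched: the structure cannot be reduced further.
            return "irreducible"
        # Reset the budget on progress, otherwise count the stalled step.
        budget = stall_budget if len(result) < len(items) else budget - 1
        items = result
        if budget <= 0:
            return "infinite grammar loop"
    return items[0]


def merge_first_two(items: list[str]) -> Optional[list[str]]:
    # A rule that always shrinks the list by merging its first two entries.
    return [f"({items[0]} {items[1]})"] + items[2:]


merged = reduce_until_stable(["a", "b", "c"], [merge_first_two])
stuck = reduce_until_stable(["a", "b"], [])
looping = reduce_until_stable(["a", "b"], [lambda items: items[::-1]], stall_budget=3)
```

The stall budget mirrors the role of `infinite_loop_detection_threshold`: a rewrite that never reduces the node count would otherwise keep the loop running forever.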
|
||||||