Spaces: williamyangwentao/Law_llama

williamyangwentao committed on Aug 23, 2023

Commit 14091f2, 1 parent: bd4066b

Upload folder using huggingface_hub

Files changed (45)

.gitignore +19 -0
LICENSE +674 -0
README.md +244 -8
assets/demo/demo.png +0 -0
assets/demo/demo07.jpeg +0 -0
assets/demo/example-01.jpeg +0 -0
assets/demo/example-02.jpeg +0 -0
assets/demo/example-03.jpeg +0 -0
assets/demo/example-04.jpeg +0 -0
assets/demo/example-05.jpeg +0 -0
assets/demo/example-06.jpeg +0 -0
assets/logo/lamda.png +0 -0
assets/logo/lawgpt.jpeg +0 -0
data/.gitkeep +0 -0
finetune.py +283 -0
infer.py +136 -0
merge.py +74 -0
models/base_models/.gitkeep +0 -0
models/lora_weights/.gitkeep +0 -0
outputs/.gitkeep +0 -0
requirements.txt +14 -0
resources/criminal_charges.json +0 -0
resources/example_infer_data.json +5 -0
resources/example_instruction_train.json +8 -0
resources/example_instruction_tune.json +12 -0
resources/legal_vocab.txt +0 -0
scripts/finetune.sh +56 -0
scripts/infer.sh +7 -0
scripts/merge.sh +4 -0
scripts/train_clm.sh +20 -0
scripts/webui.sh +28 -0
templates/alpaca.json +6 -0
templates/law_template.json +6 -0
tools/clear_law.py +78 -0
tools/merge_vocabulary.py +63 -0
train_clm.py +259 -0
utils/__init__.py +0 -0
utils/__pycache__/__init__.cpython-38.pyc +0 -0
utils/__pycache__/callbacks.cpython-38.pyc +0 -0
utils/__pycache__/prompter.cpython-38.pyc +0 -0
utils/callbacks.py +75 -0
utils/evaluate.py +196 -0
utils/merge.py +51 -0
utils/prompter.py +51 -0
webui.py +191 -0

.gitignore ADDED

	@@ -0,0 +1,19 @@

+__pycache__/
+*.npy
+*.npz
+*.pyc
+*.pyd
+*.so
+*.ipynb
+.ipynb_checkpoints
+models/base_models/*
+!models/base_models/.gitkeep
+models/lora_weights/*
+!models/lora_weights/.gitkeep
+outputs/*
+!outputs/.gitkeep
+data/*
+!data/.gitkeep
+wandb/
+flagged/
+.DS_Store

LICENSE ADDED

	@@ -0,0 +1,674 @@

+                    GNU GENERAL PUBLIC LICENSE
+                       Version 3, 29 June 2007
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+                            Preamble
+  The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
+  The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works.  By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users.  We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors.  You can apply it to
+your programs, too.
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+  To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights.  Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received.  You must make sure that they, too, receive
+or can get the source code.  And you must show them these terms so they
+know their rights.
+  Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+  For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software.  For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+  Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so.  This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software.  The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable.  Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products.  If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+  Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary.  To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+  The precise terms and conditions for copying, distribution and
+modification follow.
+                       TERMS AND CONDITIONS
+  0. Definitions.
+  "This License" refers to version 3 of the GNU General Public License.
+  "Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+  "The Program" refers to any copyrightable work licensed under this
+License.  Each licensee is addressed as "you".  "Licensees" and
+"recipients" may be individuals or organizations.
+  To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy.  The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+  A "covered work" means either the unmodified Program or a work based
+on the Program.
+  To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy.  Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+  To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies.  Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+  An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that (1) displays an appropriate copyright notice, and (2)
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License.  If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+  1. Source Code.
+  The "source code" for a work means the preferred form of the work
+for making modifications to it.  "Object code" means any non-source
+form of a work.
+  A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+  The "System Libraries" of an executable work include anything, other
+than the work as a whole, that (a) is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and (b) serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form.  A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+  The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities.  However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work.  For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+  The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+  The Corresponding Source for a work in source code form is that
+same work.
+  2. Basic Permissions.
+  All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met.  This License explicitly affirms your unlimited
+permission to run the unmodified Program.  The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work.  This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+  You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force.  You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright.  Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+  Conveying under any other circumstances is permitted solely under
+the conditions stated below.  Sublicensing is not allowed; section 10
+makes it unnecessary.
+  3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+  No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+  When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+  4. Conveying Verbatim Copies.
+  You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+  You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+  5. Conveying Modified Source Versions.
+  You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+    a) The work must carry prominent notices stating that you modified
+    it, and giving a relevant date.
+    b) The work must carry prominent notices stating that it is
+    released under this License and any conditions added under section
+    7.  This requirement modifies the requirement in section 4 to
+    "keep intact all notices".
+    c) You must license the entire work, as a whole, under this
+    License to anyone who comes into possession of a copy.  This
+    License will therefore apply, along with any applicable section 7
+    additional terms, to the whole of the work, and all its parts,
+    regardless of how they are packaged.  This License gives no
+    permission to license the work in any other way, but it does not
+    invalidate such permission if you have separately received it.
+    d) If the work has interactive user interfaces, each must display
+    Appropriate Legal Notices; however, if the Program has interactive
+    interfaces that do not display Appropriate Legal Notices, your
+    work need not make them do so.
+  A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit.  Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+  6. Conveying Non-Source Forms.
+  You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+    a) Convey the object code in, or embodied in, a physical product
+    (including a physical distribution medium), accompanied by the
+    Corresponding Source fixed on a durable physical medium
+    customarily used for software interchange.
+    b) Convey the object code in, or embodied in, a physical product
+    (including a physical distribution medium), accompanied by a
+    written offer, valid for at least three years and valid for as
+    long as you offer spare parts or customer support for that product
+    model, to give anyone who possesses the object code either (1) a
+    copy of the Corresponding Source for all the software in the
+    product that is covered by this License, on a durable physical
+    medium customarily used for software interchange, for a price no
+    more than your reasonable cost of physically performing this
+    conveying of source, or (2) access to copy the
+    Corresponding Source from a network server at no charge.
+    c) Convey individual copies of the object code with a copy of the
+    written offer to provide the Corresponding Source.  This
+    alternative is allowed only occasionally and noncommercially, and
+    only if you received the object code with such an offer, in accord
+    with subsection 6b.
+    d) Convey the object code by offering access from a designated
+    place (gratis or for a charge), and offer equivalent access to the
+    Corresponding Source in the same way through the same place at no
+    further charge.  You need not require recipients to copy the
+    Corresponding Source along with the object code.  If the place to
+    copy the object code is a network server, the Corresponding Source
+    may be on a different server (operated by you or a third party)
+    that supports equivalent copying facilities, provided you maintain
+    clear directions next to the object code saying where to find the
+    Corresponding Source.  Regardless of what server hosts the
+    Corresponding Source, you remain obligated to ensure that it is
+    available for as long as needed to satisfy these requirements.
+    e) Convey the object code using peer-to-peer transmission, provided
+    you inform other peers where the object code and Corresponding
+    Source of the work are being offered to the general public at no
+    charge under subsection 6d.
+  A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+  A "User Product" is either (1) a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or (2) anything designed or sold for incorporation
+into a dwelling.  In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage.  For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product.  A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+  "Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source.  The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+  If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information.  But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+  The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed.  Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+  Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+  7. Additional Terms.
+  "Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law.  If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+  When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it.  (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.)  You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+  Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+    a) Disclaiming warranty or limiting liability differently from the
+    terms of sections 15 and 16 of this License; or
+    b) Requiring preservation of specified reasonable legal notices or
+    author attributions in that material or in the Appropriate Legal
+    Notices displayed by works containing it; or
+    c) Prohibiting misrepresentation of the origin of that material, or
+    requiring that modified versions of such material be marked in
+    reasonable ways as different from the original version; or
+    d) Limiting the use for publicity purposes of names of licensors or
+    authors of the material; or
+    e) Declining to grant rights under trademark law for use of some
+    trade names, trademarks, or service marks; or
+    f) Requiring indemnification of licensors and authors of that
+    material by anyone who conveys the material (or modified versions of
+    it) with contractual assumptions of liability to the recipient, for
+    any liability that these contractual assumptions directly impose on
+    those licensors and authors.
+  All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10.  If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term.  If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+  If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+  Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+  8. Termination.
+  You may not propagate or modify a covered work except as expressly
+provided under this License.  Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+  However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated (a)
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and (b) permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+  Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+  Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License.  If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+  9. Acceptance Not Required for Having Copies.
+  You are not required to accept this License in order to receive or
+run a copy of the Program.  Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance.  However,
+nothing other than this License grants you permission to propagate or
+modify any covered work.  These actions infringe copyright if you do
+not accept this License.  Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+  10. Automatic Licensing of Downstream Recipients.
+  Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License.  You are not responsible
+for enforcing compliance by third parties with this License.
+  An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations.  If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+  You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License.  For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+  11. Patents.
+  A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based.  The
+work thus licensed is called the contributor's "contributor version".
+  A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version.  For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+  Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+  In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement).  To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+  If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either (1) cause the Corresponding Source to be so
+available, or (2) arrange to deprive yourself of the benefit of the
+patent license for this particular work, or (3) arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients.  "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+  If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+  A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License.  You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license (a) in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or (b) primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+  Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+  12. No Surrender of Others' Freedom.
+  If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all.  For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+  13. Use with the GNU Affero General Public License.
+  Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU Affero General Public License into a single
+combined work, and to convey the resulting work.  The terms of this
+License will continue to apply to the part which is the covered work,
+but the special requirements of the GNU Affero General Public License,
+section 13, concerning interaction through a network will apply to the
+combination as such.
+  14. Revised Versions of this License.
+  The Free Software Foundation may publish revised and/or new versions of
+the GNU General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+  Each version is given a distinguishing version number.  If the
+Program specifies that a certain numbered version of the GNU General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation.  If the Program does not specify a version number of the
+GNU General Public License, you may choose any version ever published
+by the Free Software Foundation.
+  If the Program specifies that a proxy can decide which future
+versions of the GNU General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+  Later license versions may give you additional or different
+permissions.  However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+  15. Disclaimer of Warranty.
+  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+  16. Limitation of Liability.
+  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+  17. Interpretation of Sections 15 and 16.
+  If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+                     END OF TERMS AND CONDITIONS
+            How to Apply These Terms to Your New Programs
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 3 of the License, or
+    (at your option) any later version.
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <https://www.gnu.org/licenses/>.
+Also add information on how to contact you by electronic and paper mail.
+  If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+    <program>  Copyright (C) <year>  <name of author>
+    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, your program's commands
+might be different; for a GUI interface, you would use an "about box".
+  You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU GPL, see
+<https://www.gnu.org/licenses/>.
+  The GNU General Public License does not permit incorporating your program
+into proprietary programs.  If your program is a subroutine library, you
+may consider it more useful to permit linking proprietary applications with
+the library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.  But first, please read
+<https://www.gnu.org/licenses/why-not-lgpl.html>.

README.md CHANGED Viewed

@@ -1,12 +1,248 @@
 ---
-title: Law Llama
-emoji: 🌍
-colorFrom: red
-colorTo: yellow
 sdk: gradio
-sdk_version: 3.40.1
-app_file: app.py
-pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Law_llama
+app_file: webui.py
 sdk: gradio
+sdk_version: 3.39.0
 ---
+# LaWGPT: A Large Language Model Based on Chinese Legal Knowledge
+<p align="center">
+  <a href="assets/logo/lawgpt.jpeg">
+    <img src="./assets/logo/lawgpt.jpeg" width="80%" >
+  </a>
+</p>
+<p align="center">
+    <a href="https://github.com/pengxiao-song/LaWGPT/wiki"><img src="https://img.shields.io/badge/docs-Wiki-brightgreen"></a>
+    <a href="https://huggingface.co/entity303"><img src="https://img.shields.io/badge/Hugging%20Face-entity303-green"></a>
+    <a href=""><img src="https://img.shields.io/badge/version-beta1.1-blue"></a>
+    <a href=""><img src="https://img.shields.io/badge/os-Linux-9cf"></a>
+    <a href=""><img src="https://img.shields.io/github/last-commit/pengxiao-song/lawgpt"></a>
+    <a href="https://star-history.com/#pengxiao-song/LaWGPT&Timeline"><img src="https://img.shields.io/github/stars/pengxiao-song/lawgpt?color=yellow"></a>
+    <!-- <a href="https://www.lamda.nju.edu.cn/"><img src="https://img.shields.io/badge/support-NJU--LAMDA-9cf.svg"></a> -->
+</p>
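+LaWGPT is a series of open-source large language models built on Chinese legal knowledge.
+Starting from general-purpose Chinese base models (such as Chinese-LLaMA and ChatGLM), the series extends the vocabulary with legal-domain terms and performs **pre-training on a large-scale Chinese legal corpus**, which strengthens the models' basic semantic understanding of the legal domain. On top of that, the models are **instruction-tuned on a constructed legal dialogue question-answering dataset and a Chinese judicial examination dataset**, which improves their ability to understand and follow legal content.
+For details, see the [technical report]().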
+---
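+This project is ongoing; legal-domain datasets and further models in the series will be open-sourced in due course. Stay tuned.
+## Updates
+- 🌟 2023/05/30: Public release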
+  <a href="https://huggingface.co/entity303/lawgpt-lora-7b-v2"><img src="https://img.shields.io/badge/Model-LaWGPT--7B--beta1.1-yellow"></a>
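+  - **LaWGPT-7B-beta1.1**: legal dialogue model, instruction-tuned from Chinese-alpaca-plus-7B on a constructed dataset of 350k high-quality legal question-answer pairs
+- 📣 2023/05/26: Opened the [Discussions board](https://github.com/pengxiao-song/LaWGPT/discussions); everyone is welcome to exchange ideas, offer suggestions, and share views!
+- 🛠️ 2023/05/22: Restructured the main branch, see [Project Structure](https://github.com/pengxiao-song/LaWGPT#项目结构); added support for [command-line batch inference](https://github.com/pengxiao-song/LaWGPT/blob/main/scripts/infer.sh)
+- 🪴 2023/05/15: Released [Awesome Chinese Legal Resources](https://github.com/pengxiao-song/awesome-chinese-legal-resources), a collection of Chinese legal data sources, and the [legal-domain vocabulary](https://github.com/pengxiao-song/LaWGPT/blob/main/resources/legal_vocab.txt)
+- 🌟 2023/05/13: Public release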
+  <a href="https://huggingface.co/entity303/legal-lora-7b"><img src="https://img.shields.io/badge/Model-Legal--Base--7B-blue"></a>
+  <a href="https://huggingface.co/entity303/lawgpt-legal-lora-7b"><img src="https://img.shields.io/badge/Model-LaWGPT--7B--beta1.0-yellow"></a>
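+  - **Legal-Base-7B**: legal base model, further pre-trained on 500k Chinese court judgment documents
+  - **LaWGPT-7B-beta1.0**: legal dialogue model, instruction-tuned from Legal-Base-7B on a constructed dataset of 300k high-quality legal question-answer pairs
+- 🌟 2023/04/12: Internal testing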
+  <a href="https://huggingface.co/entity303/lawgpt-lora-7b"><img src="https://img.shields.io/badge/Model-Lawgpt--7B--alpha-yellow"></a>
+  - **LaWGPT-7B-alpha**: instruction-tuned directly from Chinese-LLaMA-7B on a constructed dataset of 300k legal question-answer pairs
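+## Quick Start
+1. Prepare the code and create the environment
+   ```bash
+   # Download the code
+   git clone git@github.com:pengxiao-song/LaWGPT.git
+   cd LaWGPT
+   # Create the environment
+   conda create -n lawgpt python=3.10 -y
+   conda activate lawgpt
+   pip install -r requirements.txt
+   ```
+2. **Launch the web UI (optional; convenient for adjusting parameters)**
+   - First, run the service start-up script: `bash scripts/webui.sh`
+   - Then, open http://127.0.0.1:7860 :
+   <p align="center">
+      <img style="border-radius: 50%; box-shadow: 0 0 10px rgba(0,0,0,0.5); width: 80%;" src="./assets/demo/example-03.jpeg">
+   </p>
+3. **Command-line inference (optional; supports batch testing)**
+   - First, build a test set following the format of `resources/example_infer_data.json`;
+   - Then, run the inference script: `bash scripts/infer.sh`. The `--infer_data_path` argument is the path to the test set; if it is empty or the path is invalid, the script runs in interactive mode.
+Note that the steps above use LaWGPT-7B-alpha by default. If you want to use the LaWGPT-7B-beta1.0 model:
+- Neither [LLaMA](https://github.com/facebookresearch/llama) nor [Chinese-LLaMA](https://github.com/ymcui/Chinese-LLaMA-Alpaca) has open-sourced its model weights. Under the corresponding licenses, **this project can only release LoRA weights** and cannot release full model weights. Thank you for your understanding.
+- This project provides a [merging procedure](https://github.com/pengxiao-song/LaWGPT/wiki/%E6%A8%A1%E5%9E%8B%E5%90%88%E5%B9%B6); please obtain the original weights and rebuild the model yourself.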
+## Project Structure
+```bash
+LaWGPT
+├── assets    # static assets
+├── resources # project resources
+├── models    # base models and LoRA weights
+│   ├── base_models
+│   └── lora_weights
+├── outputs   # output weights from instruction tuning
+├── data      # experiment data
+├── scripts   # scripts
+│   ├── finetune.sh # instruction-tuning script
+│   └── webui.sh    # service start-up script
+├── templates # prompt templates
+├── tools     # tools
+├── utils
+├── train_clm.py  # continued pre-training
+├── finetune.py   # instruction tuning
+├── webui.py      # service launcher
+├── README.md
+└── requirements.txt
+```
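+## Data Construction
+This project builds on datasets such as the public legal documents from China Judgments Online and judicial examination data; for details, see [Awesome Chinese Legal Resources](https://github.com/pengxiao-song/awesome-chinese-legal-resources).
+1. Initial data generation: dialogue question-answering data is generated following the [Stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca#data-generation-process) and [self-instruct](https://github.com/yizhongw/self-instruct) approaches.
+2. Knowledge-guided data generation: data is generated from structured Chinese legal knowledge using Knowledge-based Self-Instruct.
+3. ChatGPT is used to clean the data and help build a high-quality dataset.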
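+## Model Training
+Training of the LaWGPT series proceeds in two stages:
+1.  Stage one: extend the legal-domain vocabulary and pre-train Chinese-LLaMA on large-scale legal documents and statute data
+2.  Stage two: construct a legal dialogue question-answering dataset and instruction-tune the pre-trained model
+### Continued Pre-training Procedure
+1. Build the continued pre-training dataset following `resources/example_instruction_train.json`
+2. Run `scripts/train_clm.sh`
+### Instruction Tuning Steps
+1. Build the instruction-tuning dataset following `resources/example_instruction_tune.json`
+2. Run `scripts/finetune.sh`
+### Compute Resources
+8 × Tesla V100-SXM2-32GB: continued pre-training takes about 24 h per epoch, and fine-tuning takes about 12 h per epoch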
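+## Model Evaluation
+### Output Examples
+<details><summary>Question: What is the sentence for hitting someone while driving drunk?</summary>
+![](assets/demo/demo07.jpeg)
+</details>
+<details><summary>Question: Please give a judgment opinion.</summary>
+![](assets/demo/example-05.jpeg)
+</details>
+<details><summary>Question: Please explain the definition of the crime of gambling.</summary>
+![](assets/demo/example-06.jpeg)
+</details>
+<details><summary>Question: How is overtime pay calculated?</summary>
+![](assets/demo/example-04.jpeg)
+</details>
+<details><summary>Question: What is the lawful interest rate on private lending that is protected by the state?</summary>
+![](assets/demo/example-02.jpeg)
+</details>
+<details><summary>Question: Will I go to jail if I cannot repay my credit card debt?</summary>
+![](assets/demo/example-01.jpeg)
+</details>
+<details><summary>Question: Can you write a case description for the crime of robbery?</summary>
+![](assets/demo/example-03.jpeg)
+</details>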
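+### Limitations
+Owing to limits on compute resources, data scale, and other factors, LaWGPT currently has many limitations:
+1. Data resources are limited and model capacity is small, which leads to relatively weak memorization and language ability. As a result, the model may produce incorrect results on tasks that require factual knowledge.
+2. The models have undergone only preliminary alignment with human intent. They may therefore produce unpredictable harmful content, as well as content that does not match human preferences and values.
+3. The models' self-awareness is unreliable, and their Chinese comprehension needs improvement.
+Please be aware of these issues before use, to avoid misunderstanding and unnecessary trouble.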
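+## Contributors
+This project is carried out jointly by the following people (in alphabetical order): [@cainiao](https://github.com/herobrine19), [@njuyxw](https://github.com/njuyxw), [@pengxiao-song](https://github.com/pengxiao-song)
+## Disclaimer
+Please strictly observe the following terms:
+1. All resources of this project are **for academic research only; any commercial use is strictly prohibited**.
+2. Model output is affected by many sources of uncertainty and this project cannot currently guarantee its accuracy; **use in real legal scenarios is strictly prohibited**.
+3. This project assumes no legal liability and is not responsible for any loss that may arise from the use of the related resources or output.
+## Feedback
+If you have questions, please submit them as a GitHub Issue.
+- Before submitting, we suggest checking the FAQ and past issues to see whether they resolve your problem.
+- Please discuss politely and help build a harmonious community.
+The contributors advance this project alongside their research work; because of limited manpower it is difficult to respond in real time. We apologize for any inconvenience.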
+## Acknowledgments
+This project builds on the following open-source projects; we sincerely thank these projects and their developers:
+- Chinese-LLaMA-Alpaca: https://github.com/ymcui/Chinese-LLaMA-Alpaca
+- LLaMA: https://github.com/facebookresearch/llama
+- Alpaca: https://github.com/tatsu-lab/stanford_alpaca
+- alpaca-lora: https://github.com/tloen/alpaca-lora
+- ChatGLM-6B: https://github.com/THUDM/ChatGLM-6B
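+In addition, this project relies on open data resources; see [Awesome Chinese Legal Resources](https://github.com/pengxiao-song/awesome-chinese-legal-resources). Our thanks go to them as well.
+## Citation
+If you find our work helpful, please consider citing this project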

assets/demo/demo.png ADDED Viewed

assets/demo/demo07.jpeg ADDED Viewed

assets/demo/example-01.jpeg ADDED Viewed

assets/demo/example-02.jpeg ADDED Viewed

assets/demo/example-03.jpeg ADDED Viewed

assets/demo/example-04.jpeg ADDED Viewed

assets/demo/example-05.jpeg ADDED Viewed

assets/demo/example-06.jpeg ADDED Viewed

assets/logo/lamda.png ADDED Viewed

assets/logo/lawgpt.jpeg ADDED Viewed

data/.gitkeep ADDED Viewed

File without changes

finetune.py ADDED Viewed

	@@ -0,0 +1,283 @@

+import os
+import sys
+from typing import List
+import fire
+import torch
+import transformers
+from datasets import load_dataset
+"""
+Unused imports:
+import torch.nn as nn
+import bitsandbytes as bnb
+"""
+from peft import (
+    LoraConfig,
+    get_peft_model,
+    get_peft_model_state_dict,
+    prepare_model_for_int8_training,
+    set_peft_model_state_dict,
+)
+from transformers import LlamaForCausalLM, LlamaTokenizer
+from utils.prompter import Prompter
+def train(
+    # model/data params
+    base_model: str = "",  # the only required argument
+    data_path: str = "yahma/alpaca-cleaned",
+    output_dir: str = "./lora-alpaca",
+    # training hyperparams
+    batch_size: int = 128,
+    micro_batch_size: int = 4,
+    num_epochs: int = 3,
+    learning_rate: float = 3e-4,
+    cutoff_len: int = 256,
+    val_set_size: int = 2000,
+    # lora hyperparams
+    lora_r: int = 8,
+    lora_alpha: int = 16,
+    lora_dropout: float = 0.05,
+    lora_target_modules: List[str] = [
+        "q_proj",
+        "v_proj",
+    ],
+    # llm hyperparams
+    train_on_inputs: bool = True,  # if False, masks out inputs in loss
+    add_eos_token: bool = True,
+    group_by_length: bool = False,  # faster, but produces an odd training loss curve
+    # wandb params
+    wandb_project: str = "",
+    wandb_run_name: str = "",
+    wandb_watch: str = "",  # options: false | gradients | all
+    wandb_log_model: str = "",  # options: false | true
+    resume_from_checkpoint: str = None,  # either training checkpoint or final adapter
+    prompt_template_name: str = "alpaca",  # The prompt template to use, will default to alpaca.
+):
+    if int(os.environ.get("LOCAL_RANK", 0)) == 0:
+        print(
+            f"Training Alpaca-LoRA model with params:\n"
+            f"base_model: {base_model}\n"
+            f"data_path: {data_path}\n"
+            f"output_dir: {output_dir}\n"
+            f"batch_size: {batch_size}\n"
+            f"micro_batch_size: {micro_batch_size}\n"
+            f"num_epochs: {num_epochs}\n"
+            f"learning_rate: {learning_rate}\n"
+            f"cutoff_len: {cutoff_len}\n"
+            f"val_set_size: {val_set_size}\n"
+            f"lora_r: {lora_r}\n"
+            f"lora_alpha: {lora_alpha}\n"
+            f"lora_dropout: {lora_dropout}\n"
+            f"lora_target_modules: {lora_target_modules}\n"
+            f"train_on_inputs: {train_on_inputs}\n"
+            f"add_eos_token: {add_eos_token}\n"
+            f"group_by_length: {group_by_length}\n"
+            f"wandb_project: {wandb_project}\n"
+            f"wandb_run_name: {wandb_run_name}\n"
+            f"wandb_watch: {wandb_watch}\n"
+            f"wandb_log_model: {wandb_log_model}\n"
+            f"resume_from_checkpoint: {resume_from_checkpoint or False}\n"
+            f"prompt template: {prompt_template_name}\n"
+        )
+    assert (
+        base_model
+    ), "Please specify a --base_model, e.g. --base_model='huggyllama/llama-7b'"
+    gradient_accumulation_steps = batch_size // micro_batch_size
+    prompter = Prompter(prompt_template_name)
+    device_map = "auto"
+    world_size = int(os.environ.get("WORLD_SIZE", 1))
+    ddp = world_size != 1
+    if ddp:
+        device_map = {"": int(os.environ.get("LOCAL_RANK") or 0)}
+        gradient_accumulation_steps = gradient_accumulation_steps // world_size
+    # Check if parameter passed or if set within environ
+    use_wandb = len(wandb_project) > 0 or (
+        "WANDB_PROJECT" in os.environ and len(os.environ["WANDB_PROJECT"]) > 0
+    )
+    # Only overwrite environ if wandb param passed
+    if len(wandb_project) > 0:
+        os.environ["WANDB_PROJECT"] = wandb_project
+    if len(wandb_watch) > 0:
+        os.environ["WANDB_WATCH"] = wandb_watch
+    if len(wandb_log_model) > 0:
+        os.environ["WANDB_LOG_MODEL"] = wandb_log_model
+    model = LlamaForCausalLM.from_pretrained(
+        base_model,
+        load_in_8bit=True,
+        torch_dtype=torch.float16,
+        device_map=device_map,
+    )
+    tokenizer = LlamaTokenizer.from_pretrained(base_model)
+    tokenizer.pad_token_id = (
+        0  # unk. we want this to be different from the eos token
+    )
+    tokenizer.padding_side = "left"  # Allow batched inference
+    def tokenize(prompt, add_eos_token=True):
+        # there's probably a way to do this with the tokenizer settings
+        # but again, gotta move fast
+        result = tokenizer(
+            prompt,
+            truncation=True,
+            max_length=cutoff_len,
+            padding=False,
+            return_tensors=None,
+        )
+        if (
+            result["input_ids"][-1] != tokenizer.eos_token_id
+            and len(result["input_ids"]) < cutoff_len
+            and add_eos_token
+        ):
+            result["input_ids"].append(tokenizer.eos_token_id)
+            result["attention_mask"].append(1)
+        result["labels"] = result["input_ids"].copy()
+        return result
+    def generate_and_tokenize_prompt(data_point):
+        full_prompt = prompter.generate_prompt(
+            data_point["instruction"],
+            data_point["input"],
+            data_point["output"],
+        )
+        tokenized_full_prompt = tokenize(full_prompt)
+        if not train_on_inputs:
+            user_prompt = prompter.generate_prompt(
+                data_point["instruction"], data_point["input"]
+            )
+            tokenized_user_prompt = tokenize(
+                user_prompt, add_eos_token=add_eos_token
+            )
+            user_prompt_len = len(tokenized_user_prompt["input_ids"])
+            if add_eos_token:
+                user_prompt_len -= 1
+            tokenized_full_prompt["labels"] = [
+                -100
+            ] * user_prompt_len + tokenized_full_prompt["labels"][
+                user_prompt_len:
+            ]  # could be sped up, probably
+        return tokenized_full_prompt
+    model = prepare_model_for_int8_training(model)
+    config = LoraConfig(
+        r=lora_r,
+        lora_alpha=lora_alpha,
+        target_modules=lora_target_modules,
+        lora_dropout=lora_dropout,
+        bias="none",
+        task_type="CAUSAL_LM",
+    )
+    model = get_peft_model(model, config)
+    if data_path.endswith(".json") or data_path.endswith(".jsonl"):
+        data = load_dataset("json", data_files=data_path)
+    else:
+        data = load_dataset(data_path)
+    if resume_from_checkpoint:
+        # Check the available weights and load them
+        checkpoint_name = os.path.join(
+            resume_from_checkpoint, "pytorch_model.bin"
+        )  # Full checkpoint
+        if not os.path.exists(checkpoint_name):
+            checkpoint_name = os.path.join(
+                resume_from_checkpoint, "adapter_model.bin"
+            )  # only LoRA model - LoRA config above has to fit
+            resume_from_checkpoint = (
+                False  # So the trainer won't try loading its state
+            )
+        # The two files above have a different name depending on how they were saved, but are actually the same.
+        if os.path.exists(checkpoint_name):
+            print(f"Restarting from {checkpoint_name}")
+            adapters_weights = torch.load(checkpoint_name)
+            set_peft_model_state_dict(model, adapters_weights)
+        else:
+            print(f"Checkpoint {checkpoint_name} not found")
+    model.print_trainable_parameters()  # Be more transparent about the % of trainable params.
+    if val_set_size > 0:
+        train_val = data["train"].train_test_split(
+            test_size=val_set_size, shuffle=True, seed=42
+        )
+        train_data = (
+            train_val["train"].shuffle().map(generate_and_tokenize_prompt)
+        )
+        val_data = (
+            train_val["test"].shuffle().map(generate_and_tokenize_prompt)
+        )
+    else:
+        train_data = data["train"].shuffle().map(generate_and_tokenize_prompt)
+        val_data = None
+    if not ddp and torch.cuda.device_count() > 1:
+        # keeps Trainer from trying its own DataParallelism when more than 1 gpu is available
+        model.is_parallelizable = True
+        model.model_parallel = True
+    trainer = transformers.Trainer(
+        model=model,
+        train_dataset=train_data,
+        eval_dataset=val_data,
+        args=transformers.TrainingArguments(
+            per_device_train_batch_size=micro_batch_size,
+            gradient_accumulation_steps=gradient_accumulation_steps,
+            warmup_ratio=0.1,
+            num_train_epochs=num_epochs,
+            learning_rate=learning_rate,
+            fp16=True,
+            logging_steps=10,
+            optim="adamw_torch",
+            evaluation_strategy="steps" if val_set_size > 0 else "no",
+            save_strategy="steps",
+            eval_steps=50 if val_set_size > 0 else None,
+            save_steps=50,
+            output_dir=output_dir,
+            save_total_limit=5,
+            load_best_model_at_end=True if val_set_size > 0 else False,
+            ddp_find_unused_parameters=False if ddp else None,
+            group_by_length=group_by_length,
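+            # "none" disables logging integrations; in transformers 4.x, None falls back to all installed ones
+            report_to="wandb" if use_wandb else "none",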
+            run_name=wandb_run_name if use_wandb else None,
+        ),
+        data_collator=transformers.DataCollatorForSeq2Seq(
+            tokenizer, pad_to_multiple_of=8, return_tensors="pt", padding=True
+        ),
+    )
+    model.config.use_cache = False
+    old_state_dict = model.state_dict
+    model.state_dict = (
+        lambda self, *_, **__: get_peft_model_state_dict(
+            self, old_state_dict()
+        )
+    ).__get__(model, type(model))
+    if torch.__version__ >= "2" and sys.platform != "win32":
+        model = torch.compile(model)
+    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
+    model.save_pretrained(output_dir)
+    print(
+        "\n If there's a warning about missing keys above, please disregard :)"
+    )
+if __name__ == "__main__":
+    fire.Fire(train)
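
Note on the batch-size arithmetic in `train()`: the function derives `gradient_accumulation_steps` by floor division, first by `micro_batch_size` and then, under DDP, by `WORLD_SIZE`. The sketch below only mirrors that integer arithmetic; the helper names `accumulation_steps` and `effective_batch_size` are illustrative and do not exist in `finetune.py`.

```python
def accumulation_steps(batch_size: int, micro_batch_size: int, world_size: int = 1) -> int:
    """Mirror the floor divisions in train(): accumulation steps per process."""
    steps = batch_size // micro_batch_size
    if world_size != 1:  # the DDP branch in train()
        steps = steps // world_size
    return steps


def effective_batch_size(batch_size: int, micro_batch_size: int, world_size: int = 1) -> int:
    """Samples that actually contribute to one optimizer update across all processes."""
    return micro_batch_size * accumulation_steps(batch_size, micro_batch_size, world_size) * world_size


# Defaults of train(): batch_size=128, micro_batch_size=4.
print(accumulation_steps(128, 4))        # 32 on a single process
print(accumulation_steps(128, 4, 8))     # 4 on each of 8 processes
print(effective_batch_size(128, 4, 8))   # 128

# Floor division silently shrinks the batch when the numbers do not divide evenly.
print(effective_batch_size(100, 8))      # 96
print(accumulation_steps(64, 8, 16))     # 0, which is unlikely to be what was intended
```

Choosing `batch_size` as a multiple of `micro_batch_size * WORLD_SIZE` keeps the effective batch size equal to the requested one.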

infer.py ADDED Viewed

	@@ -0,0 +1,136 @@

+import sys
+import json
+import fire
+import torch
+from peft import PeftModel
+from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
+from utils.prompter import Prompter
+if torch.cuda.is_available():
+    device = "cuda"
+else:
+    device = "cpu"  # fall back so that `device` is always defined
+class Infer():
+    def __init__(
+        self,
+        load_8bit: bool = False,
+        base_model: str = "",
+        lora_weights: str = "",
+        prompt_template: str = "",  # The prompt template to use, will default to alpaca.
+    ):
+        prompter = Prompter(prompt_template)
+        tokenizer = LlamaTokenizer.from_pretrained(base_model)
+        model = LlamaForCausalLM.from_pretrained(
+            base_model,
+            load_in_8bit=load_8bit,
+            torch_dtype=torch.float16,
+            device_map="auto",
+        )
+        try:
+            print(f"Using lora {lora_weights}")
+            model = PeftModel.from_pretrained(
+                model,
+                lora_weights,
+                torch_dtype=torch.float16,
+            )
+        except Exception as e:  # a bare except would also swallow KeyboardInterrupt
+            print("*"*50, f"\n Attention! No Lora Weights ({e}) \n", "*"*50)
+        # unwind broken decapoda-research config
+        model.config.pad_token_id = tokenizer.pad_token_id = 0  # unk
+        model.config.bos_token_id = 1
+        model.config.eos_token_id = 2
+        if not load_8bit:
+            model.half()  # seems to fix bugs for some users.
+        model.eval()
+        if torch.__version__ >= "2" and sys.platform != "win32":
+            model = torch.compile(model)
+        self.base_model = base_model
+        self.lora_weights = lora_weights
+        self.model = model
+        self.prompter = prompter
+        self.tokenizer = tokenizer
+    def generate_output(
+        self,
+        instruction,
+        input=None,
+        temperature=0.1,
+        top_p=0.75,
+        top_k=40,
+        num_beams=1,
+        max_new_tokens=256,
+        **kwargs,
+    ):
+        prompt = self.prompter.generate_prompt(instruction, input)
+        inputs = self.tokenizer(prompt, return_tensors="pt")
+        input_ids = inputs["input_ids"].to(device)
+        generation_config = GenerationConfig(
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            num_beams=num_beams,
+            # repetition_penalty=10.0,
+            **kwargs,
+        )
+        with torch.no_grad():
+            generation_output = self.model.generate(
+                input_ids=input_ids,
+                generation_config=generation_config,
+                return_dict_in_generate=True,
+                output_scores=True,
+                max_new_tokens=max_new_tokens,
+            )
+        s = generation_output.sequences[0]
+        output = self.tokenizer.decode(s)
+        return self.prompter.get_response(output)
+    def infer_from_file(self, infer_data_path):
+        with open(infer_data_path) as f:
+            for line in f:
+                data = json.loads(line)
+                instruction = data["instruction"]
+                output = data["output"]
+                print('=' * 100)
+                print(f"Base Model: {self.base_model}    Lora Weights: {self.lora_weights}")
+                print("Instruction:\n", instruction)
+                model_output = self.generate_output(instruction)
+                print("Model Output:\n", model_output)
+                print("Ground Truth:\n", output)
+                print('=' * 100)
+def main(
+    load_8bit: bool = False,
+    base_model: str = "",
+    lora_weights: str = "",
+    prompt_template: str = "",  # The prompt template to use, will default to alpaca.
+    infer_data_path: str = "",
+):
+    infer = Infer(
+        load_8bit=load_8bit,
+        base_model=base_model,
+        lora_weights=lora_weights,
+        prompt_template=prompt_template
+    )
+    try:
+        infer.infer_from_file(infer_data_path)
+    except Exception as e:
+        print(e, "Read infer_data_path Failed! Now Interactive Mode: ")
+        while True:
+            print('=' * 100)
+            instruction = input("Please enter your question: ")
+            print("LaWGPT:")
+            print(infer.generate_output(instruction))
+            print('=' * 100)
+if __name__ == "__main__":
+    fire.Fire(main)
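
Note on the input format of `infer_from_file`: it calls `json.loads` on each line, so the file is read as JSON Lines rather than as a single JSON array. A blank line raises an exception, which `main` catches before switching to interactive mode. The sketch below is one possible, more forgiving reader; `iter_jsonl_records` is a hypothetical helper, not part of `infer.py`, and the sample records are made up for illustration.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_jsonl_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one record per non-empty line and require an 'instruction' field."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines such as a trailing newline
        record = json.loads(line)
        if "instruction" not in record:
            raise ValueError(f"line {line_no}: missing 'instruction' field")
        # Fill the optional fields so that callers can index them safely.
        record.setdefault("input", "")
        record.setdefault("output", "")
        yield record


sample = [
    '{"instruction": "How is overtime pay calculated?", "input": "", "output": "N/A"}',
    "",
    '{"instruction": "Please explain the definition of the crime of gambling."}',
]
records = list(iter_jsonl_records(sample))
print(len(records))                 # 2
print(repr(records[1]["output"]))   # ''
```

An open file object can be passed directly, for example `iter_jsonl_records(open(path, encoding="utf-8"))`.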

merge.py ADDED Viewed

	@@ -0,0 +1,74 @@

+import os
+import torch
+import transformers
+from peft import PeftModel
+from transformers import LlamaForCausalLM, LlamaTokenizer  # noqa: F402
+import argparse
+parser = argparse.ArgumentParser(description='Merge Base Model and Lora')
+parser.add_argument('--base_model', type=str, default="minlik/chinese-llama-7b-merged", help='base model path')
+parser.add_argument('--lora_model', type=str, default="entity303/legal-lora-7b", help='lora model path')
+parser.add_argument('--output_dir', type=str, default="./models/base_models/llama-7b-legal-lora-merged", help='output model path')
+args = parser.parse_args()
+BASE_MODEL = args.base_model
+LORA_MODEL = args.lora_model
+OUTPUT_DIR = args.output_dir
+assert (
+    BASE_MODEL
+), "Please specify a base model with --base_model, e.g. `--base_model huggyllama/llama-7b`"  # noqa: E501
+print(f"{'*'*20} Using base model: {BASE_MODEL} {'*'*20}")
+print(f"{'*'*20} Using lora model: {LORA_MODEL} {'*'*20}")
+print(f"{'*'*20} Saving to: {OUTPUT_DIR} {'*'*20}")
+tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
+base_model = LlamaForCausalLM.from_pretrained(
+    BASE_MODEL,
+    load_in_8bit=False,
+    torch_dtype=torch.float16,
+    device_map={"": "cpu"},
+)
+first_weight = base_model.model.layers[0].self_attn.q_proj.weight
+first_weight_old = first_weight.clone()
+lora_model = PeftModel.from_pretrained(
+    base_model,
+    LORA_MODEL,
+    device_map={"": "cpu"},
+    torch_dtype=torch.float16,
+)
+lora_weight = lora_model.base_model.model.model.layers[
+    0
+].self_attn.q_proj.weight
+assert torch.allclose(first_weight_old, first_weight)
+# merge weights - new merging method from peft
+lora_model = lora_model.merge_and_unload()
+lora_model.train(False)
+# did we do anything?
+assert not torch.allclose(first_weight_old, first_weight)
+lora_model_sd = lora_model.state_dict()
+deloreanized_sd = {
+    k.replace("base_model.model.", ""): v
+    for k, v in lora_model_sd.items()
+    if "lora" not in k
+}
+LlamaForCausalLM.save_pretrained(
+    base_model, OUTPUT_DIR, state_dict=deloreanized_sd, max_shard_size="2048MB"
+)
+LlamaTokenizer.save_pretrained(tokenizer, OUTPUT_DIR)
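
Note on what the merge computes: for each adapted layer, merging a LoRA adapter folds the low-rank update into the frozen weight, `W' = W + (lora_alpha / r) * B @ A`. The pure-Python sketch below illustrates that arithmetic on a toy 2×2 matrix. It is not how `merge_and_unload()` is implemented, and the helper names and values are made up for illustration.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_a, lora_b, lora_alpha, r):
    """Return W + (lora_alpha / r) * (B @ A) without modifying the inputs."""
    scaling = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight, 2 x 2
lora_a = [[1.0, 2.0]]              # A: r x in_features = 1 x 2
lora_b = [[0.5], [0.0]]            # B: out_features x r = 2 x 1
merged = merge_lora(weight, lora_a, lora_b, lora_alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

The second `assert` in `merge.py` appears to rely on the merge changing the base weight in place; if `B` were all zeros, as in an untrained adapter, the merged weight would equal the original and that assertion would fail.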

models/base_models/.gitkeep ADDED Viewed

File without changes

models/lora_weights/.gitkeep ADDED Viewed

File without changes

outputs/.gitkeep ADDED Viewed

File without changes

requirements.txt ADDED Viewed

	@@ -0,0 +1,14 @@

+accelerate
+appdirs
+bitsandbytes
+black
+black[jupyter]
+datasets
+fire
+git+https://github.com/huggingface/peft.git@e536616888d51b453ed354a6f1e243fecb02ea08
+git+https://github.com/huggingface/transformers.git
+gradio
+sentencepiece
+wandb
+scipy
+socksio

resources/criminal_charges.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resources/example_infer_data.json ADDED Viewed

	@@ -0,0 +1,5 @@

+{"instruction":"请介绍赌博罪的定义。","input":"","output":"无"}
+{"instruction":"请问加班工资怎么算？","input":"","output":"无"}
+{"instruction":"民间借贷受国家保护的合法利息是多少?","input":"","output":"无"}
+{"instruction":"欠了信用卡的钱还不上要坐牢吗？","input":"","output":"无"}
+{"instruction":"你能否写一段抢劫罪罪名的案情描述？","input":"","output":"无"}

resources/example_instruction_train.json ADDED Viewed

	@@ -0,0 +1,8 @@

+[
+    {
+      "content": "中华人民共和国最高人民法院 再 审 决 定 书（2022）最高法刑申136号 原审被告人张某某犯挪用资金罪和伪造、变造国家机关公文罪一案，山西省运城市盐湖区人民法院于2012年5月2日以（2012）运盐刑初字第69号刑事判决，认定张克云犯贪污罪，判处有期徒刑十二年，犯伪造、变造国家机关公文罪，判处有期徒刑三年，决定执行有期徒刑十三年。宣判后，张克云不服，提出上诉。山西省运城市中级人民法院于2012年11月12日以（2012）运中刑二终字第125号刑事裁定，驳回上诉，维持原判。裁判生效后，张克云不服，提出申诉。运城市中级人民法院于2013年1月7日以（2013）运中刑申字第3号驳回申诉通知，驳回其申诉。山西省高级人民法院于2017年7月13日以（2013）晋刑监字第8号再审决定，提审本案，并于2019年12月24日以（2017）晋刑再第2号刑事判决，认定张克云犯挪用资金罪，判处有期徒刑七年六个月，与原判伪造、变造国家机关公文罪被判处的有期徒刑三年数罪并罚，决定执行有期徒刑十年。张克云仍不服，以原审认定事实错误，其作为学校董事长、全资投资人有权决定学校相关款项用途，学校仍欠其债务，个人账户用于学校经费开支，没有挪用资金的动机和行为，不构成挪用资金罪等为由，向本院提出申诉。本院经审查认为，原审生效裁判对挪用资金罪定罪量刑的证据不确实、不充分，依法应当予以排除。依照《中华人民共和国刑事诉讼法》第二百五十三条第二项、第二百五十四条第二款、第二百五十五条的规定，决定如下：指令河南省高级人民法院对本案进行再审。二〇二二年十二月二十九日"
+    },
+    {
+      "content":"中华人民共和国最高人民法院 驳 回 申 诉 通 知 书（2022）最高法刑申122号 袁某银、袁某财：你们因原审被告人袁德银故意伤害一案，对江苏省南京市溧水区人民法院（2014）溧刑初字第268号刑事判决、南京市中级人民法院（2015）宁刑终字第433号刑事裁定不服，以被害人朱宽荣住院期间的ＣＴ（136678号）报告并未显示其左侧4、5、6、7、8肋骨骨折，出院记录及137470号、143006号ＣＴ报告均系伪造，江苏省高级人民法院（2019）苏刑申172号驳回申诉通知书对137470号ＣＴ报告的形成时间认定错误为由，向本院提出申诉，请求撤销原判，依法重新审理本案。本院依法组成合议庭认真审查后认为，原审认定原审被告人袁德银因邻里纠纷，殴打被害人朱宽荣致其左胸多发肋骨骨折，构成轻伤二级，其行为构成故意伤害罪，并无不当。关于你们提出的原审认定被害人朱宽荣轻伤二级的证据系伪造的申诉理由。首先，根据你们提供的136678号ＣＴ报告，朱宽荣于2015年2月12日入院时经ＣＴ检查被诊断为左侧多发肋骨骨折，该份ＣＴ报告虽未明确骨折的具体位置，但与出院记录记载的“左侧多发肋骨骨折（4、5、6、7、8）”并不矛盾。根据《人体损伤程度鉴定标准》的规定，肋骨骨折2处以上即构成轻伤二级，故鉴定意见认定朱宽荣的损伤构成轻伤二级并无不当。其次，你们主张出院记录、137470号和143006号ＣＴ报告系伪造，但并未提供证据证实。即使排除这两份ＣＴ报告，也不影响对本案事实的认定。再次，我院对申诉进行复查的对象是已经发生法律效力的判决、裁定，本案的生效裁判是（2014）溧刑初字第268号刑事判决和（2015）宁刑终字第433号刑事裁定，江苏省高级人民法院作出的驳回申诉通知书不属于我院复查的范围。综上，本院认为，你们的申诉不符合《中华人民共和国刑事诉讼法》第二百五十三条规定的应当重新审判的情形，予以驳回。望你们尊重人民法院的生效裁判，自觉服判息诉。特此通知。二〇二二年十二月十四日"
+    }
+]

resources/example_instruction_tune.json ADDED Viewed

	@@ -0,0 +1,12 @@

+[
+  {
+    "instruction": "酒驾撞人要判多久？",
+    "input": "",
+    "output": "《刑法》第一百三十三条规定:违反交通运输管理法规，因而发生重大事故，致人重伤、死亡或者使公私财产遭受重大损失的，处三年以下有期徒刑或者拘役;交通运输肇事后逃逸，致人重伤的，处三年以上七年以下有期徒刑。交通肇事后逃逸致人死亡的，处七年以上有期徒刑。"
+  },
+  {
+    "instruction": "偷狗被派出所抓到，会怎么处理？",
+    "input": "",
+    "output": "取决于偷盗的狗的价值，可能按盗窃罪的罪名处理。如果价值达到犯罪标准，就构成盗窃罪，要承担刑事责任；如果不到犯罪标准，就是治安处罚、罚款或者拘留治安处罚这会涉嫌构成盗窃。如果不到一千元，则不会构成犯罪。如果超过一千元，则可能会是构成犯罪的。"
+  }
+]

resources/legal_vocab.txt ADDED Viewed

The diff for this file is too large to render. See raw diff

scripts/finetune.sh ADDED Viewed

	@@ -0,0 +1,56 @@

+#!/bin/bash
+export WANDB_MODE=disabled # disable wandb logging
+# Fine-tune the chinese-alpaca-plus-7b-merged model on the law dataset (./data/finetune_law_data.json)
+experiment_name="chinese-alpaca-plus-7b-law-e1"
+# Single GPU, or model parallelism across several GPUs
+python finetune.py \
+    --base_model "minlik/chinese-alpaca-plus-7b-merged" \
+    --data_path "./data/finetune_law_data.json" \
+    --output_dir "./outputs/"${experiment_name} \
+    --batch_size 64 \
+    --micro_batch_size 8 \
+    --num_epochs 20 \
+    --learning_rate 3e-4 \
+    --cutoff_len 256 \
+    --val_set_size 0 \
+    --lora_r 8 \
+    --lora_alpha 16 \
+    --lora_dropout 0.05 \
+    --lora_target_modules "[q_proj,v_proj]" \
+    --train_on_inputs False \
+    --add_eos_token True \
+    --group_by_length False \
+    --wandb_project "" \
+    --wandb_run_name "" \
+    --wandb_watch "" \
+    --wandb_log_model "" \
+    --resume_from_checkpoint "./outputs/"${experiment_name} \
+    --prompt_template_name "alpaca"
+# Multi-GPU data parallelism
+# WORLD_SIZE=8 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1234 finetune.py \
+#     --base_model "minlik/chinese-alpaca-plus-7b-merged" \
+#     --data_path "./data/finetune_law_data.json" \
+#     --output_dir "./outputs/"${experiment_name} \
+#     --batch_size 64 \
+#     --micro_batch_size 8 \
+#     --num_epochs 20 \
+#     --learning_rate 3e-4 \
+#     --cutoff_len 256 \
+#     --val_set_size 0 \
+#     --lora_r 8 \
+#     --lora_alpha 16 \
+#     --lora_dropout 0.05 \
+#     --lora_target_modules "[q_proj,v_proj]" \
+#     --train_on_inputs True \
+#     --add_eos_token True \
+#     --group_by_length False \
+#     --wandb_project "" \
+#     --wandb_run_name "" \
+#     --wandb_watch "" \
+#     --wandb_log_model "" \
+#     --resume_from_checkpoint "./outputs/"${experiment_name} \
+#     --prompt_template_name "alpaca"

scripts/infer.sh ADDED Viewed

	@@ -0,0 +1,7 @@

+python infer.py \
+    --load_8bit True \
+    --base_model 'minlik/chinese-llama-7b-merged' \
+    --lora_weights 'entity303/lawgpt-lora-7b' \
+    --prompt_template 'law_template' \
+    --infer_data_path './resources/example_infer_data.json'

scripts/merge.sh ADDED Viewed

	@@ -0,0 +1,4 @@

+python merge.py \
+    --base_model 'minlik/chinese-llama-7b-merged' \
+    --lora_model 'entity303/legal-lora-7b' \
+    --output_dir './models/base_models/legal_base-7b'

scripts/train_clm.sh ADDED Viewed

	@@ -0,0 +1,20 @@

+#!/bin/bash
+WORLD_SIZE=8 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1235 train_clm.py \
+    --base_model './models/base_models/chinese_llama_7b' \
+    --data_path './data/train_clm_data.json' \
+    --output_dir './outputs/train-clm' \
+    --batch_size 128 \
+    --micro_batch_size 8 \
+    --num_epochs 1 \
+    --learning_rate 0.0003 \
+    --cutoff_len 1024 \
+    --val_set_size 0 \
+    --lora_r 16 \
+    --lora_alpha 32 \
+    --lora_dropout 0.05 \
+    --lora_target_modules '[q_proj, v_proj, k_proj, o_proj]' \
+    --train_on_inputs True \
+    --add_eos_token True \
+    --group_by_length True \
+    --resume_from_checkpoint './outputs/train-clm'
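`train_clm.py` turns `--batch_size` and `--micro_batch_size` into a gradient accumulation count, then divides that count by `WORLD_SIZE` when running under DDP. A small sketch of that arithmetic, using a hypothetical helper `accumulation_steps` that mirrors the integer divisions in the script:

```python
def accumulation_steps(batch_size, micro_batch_size, world_size=1):
    """Mirror the integer arithmetic train_clm.py uses for gradient accumulation."""
    steps = batch_size // micro_batch_size
    if world_size != 1:
        # Under DDP every process handles its own micro-batches,
        # so the per-process accumulation count shrinks.
        steps = steps // world_size
    return steps


# Settings from scripts/train_clm.sh: 8 processes, batch 128, micro-batch 8.
clm_steps = accumulation_steps(128, 8, world_size=8)
# Settings from scripts/finetune.sh run as a single process: batch 64, micro-batch 8.
single_steps = accumulation_steps(64, 8)

print(clm_steps, single_steps)
# Effective global batch = micro-batch * accumulation steps * processes.
print(8 * clm_steps * 8)
```

With the values in `train_clm.sh` each process accumulates 2 micro-batches of 8, which restores the requested global batch of 128 across 8 GPUs. Because the divisions are integer divisions, a `batch_size` that is not a multiple of `micro_batch_size * world_size` silently yields a smaller effective batch.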

scripts/webui.sh ADDED Viewed

	@@ -0,0 +1,28 @@

+#!/bin/bash
+###
+ # Copyright (C) Baidu Ltd. All rights reserved.
+ # @FilePath: /Law_llama/scripts/webui.sh
+ # @Autor: [email protected]
+ # @Date: 2023-08-23 10:25:20
+ # @Description:
+###
+# Use a model that has already been trained and published on Hugging Face
+XPU_VISIBLE_DEVICES=4 python3 -m xacc webui.py \
+    --load_8bit False \
+    --base_model 'ziqingyang/chinese-alpaca-2-7b' \
+    --lora_weights 'entity303/lawgpt-lora-7b-v' \
+    --prompt_template "law_template" \
+    --server_name "0.0.0.0" \
+    --share_gradio True
+# To use your own fine-tuned LoRA, put your model in the corresponding directory
+# python webui.py \
+#     --load_8bit True \
+#     --base_model 'minlik/chinese-alpaca-plus-7b-merged' \
+#     --lora_weights './outputs/chinese-alpaca-plus-7b-law-e1' \
+#     --prompt_template "alpaca" \
+#     --server_name "0.0.0.0" \
+#     --share_gradio True \

templates/alpaca.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+    "description": "Template used by Alpaca-LoRA.",
+    "prompt_input": "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n",
+    "prompt_no_input": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n",
+    "response_split": "### Response:"
+}

templates/law_template.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+    "description": "Template used by Law Instruction Tuning",
+    "prompt_input": "你是中国顶尖智能法律顾问 LaWGPT，具备强大的中文法律基础语义理解能力，能够出色地理解和执行与法律问题和指令。你只能回答与中国法律领域相关的问题，其余领域的问题请礼貌地拒绝回答。接下来，请依据中国法律来回答下面这个问题。\n### 问题:\n{instruction}\n### 回答:\n",
+    "prompt_no_input": "你是中国顶尖智能法律顾问 LaWGPT，具备强大的中文法律基础语义理解能力，能够出色地理解和执行与法律问题和指令。你只能回答与中国法律领域相关的问题，其余领域的问题请礼貌地拒绝回答。接下来，请依据中国法律来回答下面这个问题。\n### 问题:\n{instruction}\n### 回答:\n",
+    "response_split": "### 回答:"
+}

tools/clear_law.py ADDED Viewed

	@@ -0,0 +1,78 @@

+import re
+import json
+class read_lawfile:
+    def __init__(self, chapter_mode=r"第[零一二三四五六七八九十百千万]+章 .+\b", entry_mode=r"第[零一二三四五六七八九十百千万]+条\b"):
+        # Regex patterns that identify chapter headings and article numbers
+        self.chapter_mode = chapter_mode
+        self.entry_mode = entry_mode
+    def read_file(self, file_path):
+        # Read the file and parse it into {chapter: {article: text}}
+        self.law = {}
+        with open(file_path, encoding='utf-8') as f:
+            content = f.read()
+        content = content.replace("\n\n", "\n")
+        content = content.replace("##", "")
+        # print(content)
+        chapter_p = re.search(self.chapter_mode, content)
+        while chapter_p is not None:
+            c_start = chapter_p.start()
+            c_end = chapter_p.end()
+            key = content[c_start:c_end]
+            content = content[c_end:]
+            chapter_p = re.search(self.chapter_mode, content)
+            if chapter_p is not None:
+                end = chapter_p.start()
+                c_content = content[:end]
+                self.law[key] = self.read_entrys(c_content)
+            # print(content[c_start:c_end])
+            else:
+                self.law[key] = self.read_entrys(content)
+        return self.law
+    def read_entrys(self, content):
+        entrys = {}
+        entry_p = re.search(self.entry_mode, content)
+        while entry_p is not None:
+            e_start = entry_p.start()
+            e_end = entry_p.end()
+            key = content[e_start:e_end]
+            content = content[e_end+1:]
+            entry_p = re.search(self.entry_mode, content)
+            if entry_p is not None:
+                end = entry_p.start()
+                e_content = content[:end]
+                entrys[key] = e_content
+            else:
+                entrys[key] = content
+        return entrys
+    # entry_p = re.search(entry_mode, content)
+    # while entry_p is not None:
+    #     start = entry_p.start()
+    #     end = entry_p.end()
+    #     # print(content[start:end])
+    #     content = content[end:]
+    #     law[content[start:end]] = read_entrys(content)
+    #     chapter_p = re.search(chapter_mode, content)
+    def show(self):
+        for key in self.law:
+            print(key, '\n')
+            for item in self.law[key]:
+                print(item, ' ', self.law[key][item])
+if __name__ == '__main__':
+    file_path = "D:/11496/Documents/project/Laws-master/经济法/价格法(1997-12-29).md"
+    r = read_lawfile()
+    law = r.read_file(file_path)  # avoid shadowing the built-in `dict`
+    r.show()
+    print(law)
+    # Write UTF-8 explicitly: ensure_ascii=False emits raw Chinese characters
+    with open('./a.json', 'w', encoding='utf-8') as f:
+        json.dump(law, f, ensure_ascii=False)
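The parser in `tools/clear_law.py` repeatedly searches for the next article marker (第…条) and slices the text between two markers. The same idea can be sketched more compactly with `re.finditer`. The function name `split_articles` and the two-article sample text are made up for illustration and are not taken from the repository or from the actual statute:

```python
import re

# Article markers such as 第一条, 第二十三条 (Chinese numerals only).
ARTICLE = re.compile(r"第[零一二三四五六七八九十百千万]+条")


def split_articles(text):
    """Map each article marker to the text that follows it, up to the next marker."""
    matches = list(ARTICLE.finditer(text))
    articles = {}
    for i, match in enumerate(matches):
        # The body ends where the next marker starts, or at the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        articles[match.group()] = text[match.end():end].strip()
    return articles


# Illustrative sample, not the real wording of any law.
sample = "第一条 为了规范价格行为，制定本法。\n第二条 在中华人民共和国境内发生的价格行为，适用本法。\n"
articles = split_articles(sample)
for number, body in articles.items():
    print(number, body)
```

Like the original, this sketch treats every occurrence of the pattern as a new article, so an article body that cites another article by number (for example "依照第一条") would be split incorrectly. Anchoring the pattern to the start of a line is one possible way to reduce that risk.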

tools/merge_vocabulary.py ADDED Viewed

	@@ -0,0 +1,63 @@

+from transformers import LlamaTokenizer
+from sentencepiece import sentencepiece_model_pb2 as model
+import sentencepiece as sp
+import argparse
+import os
+if __name__ == '__main__':
+    # Load arguments
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--load_path', default='../src/models/base_model/chinese_llama_7b/tokenizer_chinese.model', type=str)
+    parser.add_argument('--save_dir', default='../src/models/base_model/save_chinese', type=str)
+    parser.add_argument('--voc_path', default='../data/vocabulary/legal_vocab_processed.txt', type=str)
+    args = parser.parse_args()
+    LOAD_PATH = args.load_path
+    SAVE_DIR = args.save_dir
+    VOC_PATH = args.voc_path
+    # Load pre-trained llama tokenizer and sentencepiece model
+    llama_spm = model.ModelProto()
+    with open(LOAD_PATH, "rb") as f:
+        llama_spm.ParseFromString(f.read())
+    # show size of llama's vocabulary
+    llama_spm_tokens_set = set(p.piece for p in llama_spm.pieces)
+    print(f"Size of initial llama's vocabulary: {len(llama_spm_tokens_set)}")
+    # Load custom vocabulary (one token per line), skipping blank lines
+    with open(VOC_PATH, "r", encoding="utf-8") as f:
+        new_tokens = [line.strip() for line in f if line.strip()]
+    for token in new_tokens:
+        if token not in llama_spm_tokens_set:
+            new_token = model.ModelProto().SentencePiece()
+            new_token.piece = token
+            new_token.score = 0
+            llama_spm.pieces.append(new_token)
+            llama_spm_tokens_set.add(token)  # a repeated line must not be added twice
+    print(f"Size of merged llama's vocabulary: {len(llama_spm.pieces)}")
+    # save
+    os.makedirs(SAVE_DIR, exist_ok=True)
+    SAVE_MODEL_PATH = os.path.join(SAVE_DIR, 'tokenizer.model')
+    SAVE_VOCAB_PATH = os.path.join(SAVE_DIR, 'tokenizer.vocab')
+    with open(SAVE_MODEL_PATH, 'wb') as f:
+        f.write(llama_spm.SerializeToString())
+    with open(SAVE_VOCAB_PATH, 'w', encoding='utf-8') as f:
+        f.writelines([f'{token.piece} {token.score}\n' for token in llama_spm.pieces])
+    tokenizer = LlamaTokenizer(SAVE_MODEL_PATH)
+    tokenizer.save_pretrained(SAVE_DIR)
+    print(f'New llama tokenizer and spm has been saved to {SAVE_DIR}')
+    # test
+    llama_tokenizer_old = LlamaTokenizer.from_pretrained(LOAD_PATH)
+    llama_tokenizer_new = LlamaTokenizer.from_pretrained(SAVE_DIR)
+    text = '''登记错误赔偿责任登记等手续登记等手续生效登记机构和登记办法登记机构赔偿后登记机构应当提供登记收费问题'''
+    print(f'Size of old vocabulary: {llama_tokenizer_old.vocab_size}')
+    print(f'Size of new vocabulary: {llama_tokenizer_new.vocab_size}')
+    print('All special tokens and ids in new llama:')
+    print(llama_tokenizer_new.all_special_tokens)
+    print(llama_tokenizer_new.all_special_ids)
+    print(llama_tokenizer_new.special_tokens_map)
+    print(f'Text:\n{text}')
+    print(f'Tokenized by LLaMA tokenizer:\n {llama_tokenizer_old.tokenize(text)}')
+    print(f'Tokenized by NEW LLaMA tokenizer:\n {llama_tokenizer_new.tokenize(text)}')
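The core of `tools/merge_vocabulary.py` is a set-membership merge: every legal term that the base tokenizer does not already contain is appended as a new piece. The sketch below shows only that logic on plain Python lists, without sentencepiece. The function name `merge_vocab` and the tiny vocabularies are illustrative assumptions:

```python
def merge_vocab(existing_pieces, candidate_lines):
    """Append unseen, non-empty candidates to the vocabulary, preserving order."""
    seen = set(existing_pieces)
    merged = list(existing_pieces)
    for line in candidate_lines:
        token = line.strip()
        # Skip blank lines and anything already present, including repeats
        # inside the candidate list itself.
        if token and token not in seen:
            merged.append(token)
            seen.add(token)
    return merged


base = ["<unk>", "<s>", "</s>", "登记"]
extra = ["登记", "登记机构", "", "赔偿责任", "登记机构"]
merged = merge_vocab(base, extra)
print(len(base), "->", len(merged))
print(merged[len(base):])
```

Appending pieces keeps every existing token id unchanged, so only the new ids need fresh embedding rows in the model. Resizing the embedding matrix is a separate step that this script does not perform.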

train_clm.py ADDED Viewed

	@@ -0,0 +1,259 @@

+import os
+import sys
+from typing import List
+import fire
+import torch
+import transformers
+from datasets import load_dataset
+from peft import (
+    LoraConfig,
+    get_peft_model,
+    get_peft_model_state_dict,
+    prepare_model_for_int8_training,
+    set_peft_model_state_dict,
+)
+from transformers import LlamaForCausalLM, LlamaTokenizer
+from utils.prompter import Prompter
+def train(
+    # model/data params
+    base_model: str = "./models/base_models/your_base_model_dir",
+    data_path: str = "./data/your_data.json",
+    output_dir: str = "./outputs/your_version_dir",
+    # training hyperparams
+    batch_size: int = 128,
+    micro_batch_size: int = 4,
+    num_epochs: int = 10,
+    learning_rate: float = 3e-4,
+    cutoff_len: int = 512,
+    val_set_size: int = 2000,
+    # lora hyperparams
+    lora_r: int = 8,
+    lora_alpha: int = 16,
+    lora_dropout: float = 0.05,
+    lora_target_modules: List[str] = ["q_proj", "v_proj",],
+    # llm hyperparams
+    train_on_inputs: bool = True,  # if False, masks out inputs in loss
+    add_eos_token: bool = True,
+    group_by_length: bool = False,  # faster, but produces an odd training loss curve
+    # wandb params
+    wandb_project: str = "",
+    wandb_run_name: str = "",
+    wandb_watch: str = "",  # options: false | gradients | all
+    wandb_log_model: str = "",  # options: false | true
+    # either training checkpoint or final adapter
+    resume_from_checkpoint: str = None,
+    # The prompt template to use, will default to alpaca.
+    prompt_template_name: str = "alpaca",
+):
+    if int(os.environ.get("LOCAL_RANK", 0)) == 0:
+        print(
+            f"Training Alpaca-LoRA model with params:\n"
+            f"base_model: {base_model}\n"
+            f"data_path: {data_path}\n"
+            f"output_dir: {output_dir}\n"
+            f"batch_size: {batch_size}\n"
+            f"micro_batch_size: {micro_batch_size}\n"
+            f"num_epochs: {num_epochs}\n"
+            f"learning_rate: {learning_rate}\n"
+            f"cutoff_len: {cutoff_len}\n"
+            f"val_set_size: {val_set_size}\n"
+            f"lora_r: {lora_r}\n"
+            f"lora_alpha: {lora_alpha}\n"
+            f"lora_dropout: {lora_dropout}\n"
+            f"lora_target_modules: {lora_target_modules}\n"
+            f"train_on_inputs: {train_on_inputs}\n"
+            f"add_eos_token: {add_eos_token}\n"
+            f"group_by_length: {group_by_length}\n"
+            f"wandb_project: {wandb_project}\n"
+            f"wandb_run_name: {wandb_run_name}\n"
+            f"wandb_watch: {wandb_watch}\n"
+            f"wandb_log_model: {wandb_log_model}\n"
+            f"resume_from_checkpoint: {resume_from_checkpoint or False}\n"
+            f"prompt template: {prompt_template_name}\n"
+        )
+    gradient_accumulation_steps = batch_size // micro_batch_size
+    prompter = Prompter(prompt_template_name)
+    # Configure device and distributed training
+    device_map = "auto"
+    world_size = int(os.environ.get("WORLD_SIZE", 1))
+    ddp = world_size != 1
+    if ddp:
+        device_map = {"": int(os.environ.get("LOCAL_RANK") or 0)}
+        gradient_accumulation_steps = gradient_accumulation_steps // world_size
+    # Check if parameter passed or if set within environ
+    use_wandb = len(wandb_project) > 0 or (
+        "WANDB_PROJECT" in os.environ and len(os.environ["WANDB_PROJECT"]) > 0)
+    # Only overwrite environ if wandb param passed
+    if len(wandb_project) > 0:
+        os.environ["WANDB_PROJECT"] = wandb_project
+    if len(wandb_watch) > 0:
+        os.environ["WANDB_WATCH"] = wandb_watch
+    if len(wandb_log_model) > 0:
+        os.environ["WANDB_LOG_MODEL"] = wandb_log_model
+    model = LlamaForCausalLM.from_pretrained(
+        base_model,
+        load_in_8bit=True,
+        torch_dtype=torch.float16,
+        device_map=device_map,
+    )
+    tokenizer = LlamaTokenizer.from_pretrained(base_model)
+    tokenizer.bos_token_id = 1
+    tokenizer.eos_token_id = 2
+    bos = tokenizer.bos_token_id
+    eos = tokenizer.eos_token_id
+    pad = tokenizer.pad_token_id
+    print("pre-trained model's BOS EOS and PAD token id:",
+          bos, eos, pad, " => It should be 1,2,none")
+    tokenizer.pad_token_id = (
+        0  # unk. we want this to be different from the eos token
+    )
+    tokenizer.padding_side = "left"  # Allow batched inference
+    def tokenize(prompt, add_eos_token=True):
+        # there's probably a way to do this with the tokenizer settings
+        # but again, gotta move fast
+        result = tokenizer(
+            prompt,
+            truncation=True,
+            max_length=cutoff_len,
+            padding=False,
+            return_tensors=None,
+        )
+        if (
+            result["input_ids"][-1] != tokenizer.eos_token_id
+            and len(result["input_ids"]) < cutoff_len
+            and add_eos_token
+        ):
+            result["input_ids"].append(tokenizer.eos_token_id)
+            result["attention_mask"].append(1)
+        result["labels"] = result["input_ids"].copy()
+        return result
+    def generate_and_tokenize_prompt(data_point):
+        text = data_point['content']
+        tokenized_full_prompt = tokenize(text)
+        return tokenized_full_prompt
+    model = prepare_model_for_int8_training(model)
+    config = LoraConfig(
+        r=lora_r,
+        lora_alpha=lora_alpha,
+        target_modules=lora_target_modules,
+        lora_dropout=lora_dropout,
+        bias="none",
+        task_type="CAUSAL_LM",
+    )
+    model = get_peft_model(model, config)
+    if data_path.endswith(".json") or data_path.endswith(".jsonl"):
+        data = load_dataset("json", data_files=data_path)
+    else:
+        data = load_dataset(data_path)
+    if resume_from_checkpoint:
+        # Check the available weights and load them
+        checkpoint_name = os.path.join(
+            resume_from_checkpoint, "pytorch_model.bin"
+        )  # Full checkpoint
+        if not os.path.exists(checkpoint_name):
+            checkpoint_name = os.path.join(
+                resume_from_checkpoint, "adapter_model.bin"
+            )  # only LoRA model - LoRA config above has to fit
+            resume_from_checkpoint = (
+                False  # So the trainer won't try loading its state
+            )
+        # The two files above have a different name depending on how they were saved, but are actually the same.
+        if os.path.exists(checkpoint_name):
+            print(f"Restarting from {checkpoint_name}")
+            adapters_weights = torch.load(checkpoint_name)
+            set_peft_model_state_dict(model, adapters_weights)
+        else:
+            print(f"Checkpoint {checkpoint_name} not found")
+    # Be more transparent about the % of trainable params.
+    model.print_trainable_parameters()
+    if val_set_size > 0:
+        train_val = data["train"].train_test_split(test_size=val_set_size, shuffle=True, seed=42)
+        train_data = (train_val["train"].shuffle().map(generate_and_tokenize_prompt))
+        val_data = (train_val["test"].shuffle().map(generate_and_tokenize_prompt))
+    else:
+        train_data = data["train"].shuffle().map(generate_and_tokenize_prompt)
+        val_data = None
+    if not ddp and torch.cuda.device_count() > 1:
+        # keeps Trainer from trying its own DataParallelism when more than 1 gpu is available
+        model.is_parallelizable = True
+        model.model_parallel = True
+    trainer = transformers.Trainer(
+        model=model,
+        train_dataset=train_data,
+        eval_dataset=val_data,
+        args=transformers.TrainingArguments(
+            per_device_train_batch_size=micro_batch_size,
+            gradient_accumulation_steps=gradient_accumulation_steps,
+            warmup_steps=100,
+            num_train_epochs=num_epochs,
+            learning_rate=learning_rate,
+            fp16=True,
+            logging_steps=10,
+            optim="adamw_torch",
+            evaluation_strategy="steps" if val_set_size > 0 else "no",
+            save_strategy="steps",
+            eval_steps=100 if val_set_size > 0 else None,
+            save_steps=100,
+            output_dir=output_dir,
+            save_total_limit=3,
+            load_best_model_at_end=True if val_set_size > 0 else False,
+            ddp_find_unused_parameters=False if ddp else None,
+            group_by_length=group_by_length,
+            report_to="wandb" if use_wandb else "none",  # "none" disables all reporting integrations
+            run_name=wandb_run_name if use_wandb else None,
+        ),
+        data_collator=transformers.DataCollatorForSeq2Seq(
+            tokenizer, pad_to_multiple_of=8, return_tensors="pt", padding=True
+        ),
+    )
+    model.config.use_cache = False
+    old_state_dict = model.state_dict
+    model.state_dict = (
+        lambda self, *_, **__: get_peft_model_state_dict(
+            self, old_state_dict()
+        )
+    ).__get__(model, type(model))
+    if torch.__version__ >= "2" and sys.platform != "win32":
+        model = torch.compile(model)
+    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
+    model.save_pretrained(output_dir)
+    print("\n If there's a warning about missing keys above, please disregard :)")
+if __name__ == "__main__":
+    fire.Fire(train)
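The `tokenize` helper in `train_clm.py` truncates each document to `cutoff_len`, appends an EOS token only when there is still room for it, and copies `input_ids` into `labels` so that the loss covers every token of the raw text. A sketch of that post-processing on plain lists of token ids, assuming a hypothetical helper `finalize_ids` and no real tokenizer:

```python
def finalize_ids(input_ids, eos_id=2, cutoff_len=512, add_eos_token=True):
    """Truncate, optionally append EOS, and build labels for causal LM training."""
    ids = list(input_ids[:cutoff_len])  # stands in for truncation=True, max_length=cutoff_len
    has_room = len(ids) < cutoff_len
    ends_with_eos = bool(ids) and ids[-1] == eos_id
    if add_eos_token and has_room and not ends_with_eos:
        ids.append(eos_id)
    return {
        "input_ids": ids,
        "attention_mask": [1] * len(ids),
        # Plain language modelling: every position is a training target.
        "labels": ids.copy(),
    }


short = finalize_ids([1, 5, 6], cutoff_len=8)
full = finalize_ids([1, 5, 6, 7, 8], cutoff_len=4)
print(short["input_ids"])
print(full["input_ids"])
```

A sequence that already fills `cutoff_len` gets no EOS, so a long judgment that is cut off mid-text is not marked as a complete document.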

utils/__init__.py ADDED Viewed

File without changes

utils/__pycache__/__init__.cpython-38.pyc ADDED Viewed

Binary file (131 Bytes). View file

utils/__pycache__/callbacks.cpython-38.pyc ADDED Viewed

Binary file (2.65 kB). View file

utils/__pycache__/prompter.cpython-38.pyc ADDED Viewed

Binary file (1.61 kB). View file

utils/callbacks.py ADDED Viewed

	@@ -0,0 +1,75 @@

+"""
+Helpers to support streaming generate output.
+Borrowed from https://github.com/oobabooga/text-generation-webui/blob/ad37f396fc8bcbab90e11ecf17c56c97bfbd4a9c/modules/callbacks.py
+"""
+import gc
+import traceback
+from queue import Queue
+from threading import Thread
+import torch
+import transformers
+class Stream(transformers.StoppingCriteria):
+    def __init__(self, callback_func=None):
+        self.callback_func = callback_func
+    def __call__(self, input_ids, scores) -> bool:
+        if self.callback_func is not None:
+            self.callback_func(input_ids[0])
+        return False
+class Iteratorize:
+    """
+    Transforms a function that takes a callback
+    into a lazy iterator (generator).
+    """
+    def __init__(self, func, kwargs=None, callback=None):
+        self.mfunc = func
+        self.c_callback = callback
+        self.q = Queue()
+        self.sentinel = object()
+        # Avoid a shared mutable default argument
+        self.kwargs = kwargs or {}
+        self.stop_now = False
+        def _callback(val):
+            if self.stop_now:
+                raise ValueError
+            self.q.put(val)
+        def gentask():
+            ret = None  # stays None if the wrapped function raises
+            try:
+                ret = self.mfunc(callback=_callback, **self.kwargs)
+            except ValueError:
+                pass
+            except Exception:
+                traceback.print_exc()
+            self.q.put(self.sentinel)
+            if self.c_callback:
+                self.c_callback(ret)
+        self.thread = Thread(target=gentask)
+        self.thread.start()
+    def __iter__(self):
+        return self
+    def __next__(self):
+        obj = self.q.get(True, None)
+        if obj is self.sentinel:
+            raise StopIteration
+        else:
+            return obj
+    def __enter__(self):
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.stop_now = True
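`Iteratorize` converts a function that reports progress through a callback into an iterator by running it in a worker thread and passing values through a queue. The pattern can be reduced to a few lines. The generator name `iterate_callbacks` and the toy producer are illustrative and are not part of `utils/callbacks.py`:

```python
from queue import Queue
from threading import Thread


def iterate_callbacks(func):
    """Run func(callback) in a thread and yield every value passed to callback."""
    q = Queue()
    sentinel = object()  # unique end-of-stream marker

    def worker():
        try:
            func(q.put)
        finally:
            # Always signal completion, even if func raised.
            q.put(sentinel)

    Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item


def produce(callback):
    # Stands in for a generation loop that reports each partial result.
    for step in range(3):
        callback(step)


values = list(iterate_callbacks(produce))
print(values)
```

The class in this commit additionally supports early stopping: leaving the `with` block sets `stop_now`, and the next callback raises `ValueError` inside the worker to abort generation.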

utils/evaluate.py ADDED Viewed

	@@ -0,0 +1,196 @@

+import math
+import os
+import sys
+import fire
+from tqdm import tqdm
+import pandas as pd
+import torch
+import transformers
+from peft import PeftModel
+import datasets
+from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
+from utils.callbacks import Iteratorize, Stream
+from utils.prompter import Prompter
+device = "cuda"
+def main(
+    load_8bit: bool = True,
+    base_model: str = "decapoda-research/llama-7b-hf",
+    lora_weights: str = "./lora-alpaca",
+    data_path: str = "./data",
+    output_path: str = "./output",
+    eval_rate: float = 0.1,
+    batch_size: int = 32,
+    # The prompt template to use, will default to alpaca.
+    prompt_template: str = "alpaca",
+):
+    base_model = base_model or os.environ.get("BASE_MODEL", "")
+    assert (base_model), "Please specify a --base_model, e.g. --base_model='huggyllama/llama-7b'"
+    prompter = Prompter(prompt_template)
+    tokenizer = LlamaTokenizer.from_pretrained(base_model)
+    if device == "cuda":
+        model = LlamaForCausalLM.from_pretrained(
+            base_model,
+            load_in_8bit=load_8bit,
+            torch_dtype=torch.float16,
+            device_map="auto",
+        )
+        model = PeftModel.from_pretrained(
+            model,
+            lora_weights,
+            torch_dtype=torch.float16,
+        )
+    # unwind broken decapoda-research config
+    model.config.pad_token_id = tokenizer.pad_token_id = 0  # unk
+    model.config.bos_token_id = 1
+    model.config.eos_token_id = 2
+    if not load_8bit:
+        model.half()  # seems to fix bugs for some users.
+    model.eval()
+    if torch.__version__ >= "2" and sys.platform != "win32":
+        model = torch.compile(model)
+    def evaluate_one(
+        instruction,
+        input=None,
+        temperature=0.1,
+        top_p=0.75,
+        top_k=40,
+        num_beams=2,
+        max_new_tokens=128,
+        **kwargs,
+    ):
+        prompt = prompter.generate_prompt(instruction, input)
+        inputs = tokenizer(prompt, return_tensors="pt")
+        input_ids = inputs["input_ids"].to(device)
+        generation_config = GenerationConfig(
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            num_beams=num_beams,
+            **kwargs,
+        )
+        # Without streaming
+        with torch.no_grad():
+            generation_output = model.generate(
+                input_ids=input_ids,
+                generation_config=generation_config,
+                return_dict_in_generate=True,
+                output_scores=True,
+                max_new_tokens=max_new_tokens,
+            )
+        s = generation_output.sequences[0]
+        output = tokenizer.decode(s, skip_special_tokens=True)
+        return prompter.get_response(output)
+    def evaluate_all():
+        # data = datasets.load_dataset("json", data_files=data_path)
+        # data = data["train"]
+        # df = data.to_pandas()
+        df = pd.read_json(data_path, orient='records')
+        print(df.info())
+        # Compute exact-match accuracy
+        correct = 0
+        total = 0
+        total_step = len(df)
+        pbar = tqdm(total=total_step, unit='batch')
+        error = []
+        for i in range(total_step):
+            instruction = df['instruction'].iloc[i]
+            input = df['input'].iloc[i]
+            label = df['output'].iloc[i]
+            pred = evaluate_one(instruction=instruction, input=input)
+            if pred == label:
+                correct += 1
+            else:
+                error.append((label, pred))
+            total += 1
+            acc = correct / total
+            # Update the progress bar
+            pbar.set_description(
+                f"Testing: Sample [{total}/{total_step}] Acc: {acc :.4f}")
+            pbar.update(1)
+        for e in error:
+            print(e)
+    def evaluate_by_batch(
+        temperature=0.1,
+        top_p=0.75,
+        top_k=40,
+        num_beams=1,
+        max_new_tokens=32
+    ):
+        df = pd.read_json(data_path, orient='records')
+        # df = df.sample(frac=eval_rate).reset_index(drop=True)
+        df['prompt'] = df.apply(lambda x: prompter.generate_prompt(
+            x['instruction'], x['input']), axis=1)
+        tokenizer.padding_side = "left"  # Allow batched inference
+        generation_config = GenerationConfig(
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            num_beams=num_beams
+        )
+        outputs = []
+        total = 0
+        total_step = math.ceil(len(df) / batch_size)
+        pbar = tqdm(total=total_step, unit='batch')
+        # Generate predictions batch by batch
+        with torch.no_grad():
+            for i in range(total_step):
+                batch = df.iloc[i*batch_size:(i+1)*batch_size]
+                inputs = tokenizer(batch['prompt'].tolist(), return_tensors="pt", padding=True)[
+                    'input_ids'].to(device)
+                generation_outputs = model.generate(
+                    input_ids=inputs,
+                    generation_config=generation_config,
+                    max_new_tokens=max_new_tokens,
+                    pad_token_id=tokenizer.pad_token_id
+                )
+                for g in generation_outputs:
+                    decoded_item = tokenizer.decode(
+                        g, skip_special_tokens=True)
+                    try:
+                        output = prompter.get_response(decoded_item)
+                    except IndexError:
+                        # The response marker is missing; keep the raw decoded text
+                        output = decoded_item
+                    outputs.append(output)
+                    total += 1
+                # Update the progress bar
+                pbar.set_description(f"Testing: Sample [{total}/{len(df)}] ")
+                pbar.update(1)
+        df['pred'] = outputs
+        df['pred'].to_csv(output_path, index=False)
+    evaluate_by_batch()
+if __name__ == "__main__":
+    # fire.Fire(main)
+    import yaml
+    dataset_param = sys.argv[1]
+    with open("./configs/evaluate_params.yaml", "r") as stream:
+        # try:
+        params = yaml.safe_load(stream)
+        print('=' * 80)
+        print(params[dataset_param])
+        print('=' * 80)
+    # fire.Fire(train)
+    main(**params[dataset_param])
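`evaluate_by_batch` walks the dataframe in slices of `batch_size` rows and uses `math.ceil` so that a final, smaller batch is still processed. The slicing can be checked in isolation. The helper `iter_batches` and the 70-item list are assumptions for illustration:

```python
import math


def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    total_steps = math.ceil(len(items) / batch_size)
    for i in range(total_steps):
        yield items[i * batch_size:(i + 1) * batch_size]


prompts = [f"prompt-{n}" for n in range(70)]
sizes = [len(batch) for batch in iter_batches(prompts, 32)]
print(sizes)
```

With the default `batch_size` of 32, 70 prompts produce two full batches and one batch of 6. Prompts in a batch have different lengths, which is why the script sets `tokenizer.padding_side = "left"` before tokenizing: padding on the left keeps the end of every prompt adjacent to the tokens that get generated.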

utils/merge.py ADDED Viewed

	@@ -0,0 +1,51 @@

+import os
+import torch
+import transformers
+from peft import PeftModel
+from transformers import LlamaForCausalLM, LlamaTokenizer  # noqa: F402
+BASE_MODEL = os.environ.get("BASE_MODEL", None)
+assert (
+    BASE_MODEL
+), "Please specify a value for BASE_MODEL environment variable, e.g. `export BASE_MODEL=huggyllama/llama-7b`"  # noqa: E501
+tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
+base_model = LlamaForCausalLM.from_pretrained(
+    BASE_MODEL,
+    load_in_8bit=False,
+    torch_dtype=torch.float16,
+    device_map={"": "cpu"},
+)
+first_weight = base_model.model.layers[0].self_attn.q_proj.weight
+first_weight_old = first_weight.clone()
+lora_model = PeftModel.from_pretrained(
+    base_model,
+    "../outputs/lora-llama-clm-e2",
+    device_map={"": "cpu"},
+    torch_dtype=torch.float16,
+)
+lora_weight = lora_model.base_model.model.model.layers[0].self_attn.q_proj.weight
+assert torch.allclose(first_weight_old, first_weight)
+# merge weights - new merging method from peft
+lora_model = lora_model.merge_and_unload()
+lora_model.train(False)
+# did we do anything?
+assert not torch.allclose(first_weight_old, first_weight)
+lora_model_sd = lora_model.state_dict()
+deloreanized_sd = {
+    k.replace("base_model.model.", ""): v
+    for k, v in lora_model_sd.items()
+    if "lora" not in k
+}
+LlamaForCausalLM.save_pretrained(base_model, '../models/legal-base-7b', state_dict=deloreanized_sd, max_shard_size="400MB")
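`utils/merge.py` relies on `merge_and_unload()` to fold the LoRA adapter into the base weights, and asserts afterwards that the first `q_proj` weight has changed. In the standard LoRA formulation the merged weight is `W + (lora_alpha / r) * B @ A`. The sketch below applies that formula to tiny nested lists. It illustrates the formula only, not PEFT's implementation, and the helper names are hypothetical:

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, lora_alpha, r):
    """Return W + (lora_alpha / r) * B @ A without modifying the inputs."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # 2 x 2 base weight
lora_b = [[1.0], [2.0]]            # out_features x r, with r = 1
lora_a = [[0.5, -0.5]]             # r x in_features
merged = merge_lora(weight, lora_b, lora_a, lora_alpha=2, r=1)
print(merged)
```

With the fine-tuning settings in this commit (`lora_r 8`, `lora_alpha 16`) the scale factor would be 2. After merging, the adapter matrices are no longer needed at inference time, which is why the script drops every state-dict key containing `lora` before saving.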

utils/prompter.py ADDED Viewed

	@@ -0,0 +1,51 @@

+"""
+A dedicated helper to manage templates and prompt building.
+"""
+import json
+import os.path as osp
+from typing import Union
+class Prompter(object):
+    __slots__ = ("template", "_verbose")
+    def __init__(self, template_name: str = "", verbose: bool = False):
+        self._verbose = verbose
+        if not template_name:
+            # Enforce the default here, so the constructor can be called with '' and will not break.
+            template_name = "alpaca"
+        file_name = osp.join("templates", f"{template_name}.json")
+        if not osp.exists(file_name):
+            raise ValueError(f"Can't read {file_name}")
+        with open(file_name, encoding="utf-8") as fp:  # templates contain Chinese text
+            self.template = json.load(fp)
+        if self._verbose:
+            print(
+                f"Using prompt template {template_name}: {self.template['description']}"
+            )
+    def generate_prompt(
+        self,
+        instruction: str,
+        input: Union[None, str] = None,
+        label: Union[None, str] = None,
+    ) -> str:
+        # returns the full prompt from instruction and optional input
+        # if a label (=response, =output) is provided, it's also appended.
+        if input:
+            res = self.template["prompt_input"].format(
+                instruction=instruction, input=input
+            )
+        else:
+            res = self.template["prompt_no_input"].format(
+                instruction=instruction
+            )
+        if label:
+            res = f"{res}{label}"
+        if self._verbose:
+            print(res)
+        return res
+    def get_response(self, output: str) -> str:
+        return output.split(self.template["response_split"])[1].strip()

webui.py ADDED

	@@ -0,0 +1,191 @@

+import os
+import sys
+import fire
+import gradio as gr
+import torch
+import transformers
+from peft import PeftModel
+from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer, AutoModel, AutoTokenizer, AutoModelForCausalLM, AutoConfig
+from utils.callbacks import Iteratorize, Stream
+from utils.prompter import Prompter
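+# Pick the best available device instead of hard-coding one.
+if torch.cuda.is_available():
+    device = "cuda"
+elif hasattr(torch, "xpu") and torch.xpu.is_available():
+    device = "xpu"  # Intel GPU; this was previously the hard-coded value
+elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
+    device = "mps"
+else:
+    device = "cpu"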
+def main(
+    load_8bit: bool = False,
+    base_model: str = "",
+    lora_weights: str = "",
+    prompt_template: str = "",  # The prompt template to use, will default to alpaca.
+    server_name: str = "0.0.0.0",  # Listen on all network interfaces by default.
+    share_gradio: bool = False,
+):
+    base_model = base_model or os.environ.get("BASE_MODEL", "")
+    assert (
+        base_model
+    ), "Please specify a --base_model, e.g. --base_model='huggyllama/llama-7b'"
+    prompter = Prompter(prompt_template)
+    tokenizer = LlamaTokenizer.from_pretrained(base_model)
+    config = AutoConfig.from_pretrained(base_model)
+    model = AutoModelForCausalLM.from_pretrained(
+        base_model,
+        config=config,
+        load_in_8bit=load_8bit,
+        torch_dtype=torch.float16,
+        device_map="auto",
+    )
+    try:
+        print(f"Using lora {lora_weights}")
+        model = PeftModel.from_pretrained(
+            model,
+            lora_weights,
+            torch_dtype=torch.float16,
+        )
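+    except Exception as exc:
+        # Fall back to the base model, but report why the LoRA weights were skipped.
+        print("*" * 50, f"\n Attention! No LoRA weights loaded: {exc} \n", "*" * 50)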
+    # unwind broken decapoda-research config
+    model.config.pad_token_id = tokenizer.pad_token_id = 0  # unk
+    model.config.bos_token_id = 1
+    model.config.eos_token_id = 2
+    # if not load_8bit:
+    #     model.half()  # seems to fix bugs for some users.
+    model.eval()
+    # Compare the major version numerically; a plain string comparison is fragile.
+    if int(torch.__version__.split(".")[0]) >= 2 and sys.platform != "win32":
+        model = torch.compile(model)
+    def evaluate(
+        instruction,
+        # input=None,
+        temperature=0.1,
+        top_p=0.75,
+        top_k=40,
+        num_beams=4,
+        max_new_tokens=128,
+        stream_output=False,
+        **kwargs,
+    ):
+        # The UI exposes no separate "input" field, so only the instruction is used.
+        prompt = prompter.generate_prompt(instruction)
+        inputs = tokenizer(prompt, return_tensors="pt")
+        input_ids = inputs["input_ids"].to(device)
+        generation_config = GenerationConfig(
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            num_beams=num_beams,
+            **kwargs,
+        )
+        generate_params = {
+            "input_ids": input_ids,
+            "generation_config": generation_config,
+            "return_dict_in_generate": True,
+            "output_scores": True,
+            "max_new_tokens": max_new_tokens,
+        }
+        if stream_output:
+            # Stream the reply 1 token at a time.
+            # This is based on the trick of using 'stopping_criteria' to create an iterator,
+            # from https://github.com/oobabooga/text-generation-webui/blob/ad37f396fc8bcbab90e11ecf17c56c97bfbd4a9c/modules/text_generation.py#L216-L243.
+            def generate_with_callback(callback=None, **kwargs):
+                kwargs.setdefault(
+                    "stopping_criteria", transformers.StoppingCriteriaList()
+                )
+                kwargs["stopping_criteria"].append(
+                    Stream(callback_func=callback)
+                )
+                with torch.no_grad():
+                    model.generate(**kwargs)
+            def generate_with_streaming(**kwargs):
+                return Iteratorize(
+                    generate_with_callback, kwargs, callback=None
+                )
+            with generate_with_streaming(**generate_params) as generator:
+                for output in generator:
+                    # new_tokens = len(output) - len(input_ids[0])
+                    decoded_output = tokenizer.decode(output)
+                    if output[-1] in [tokenizer.eos_token_id]:
+                        break
+                    yield prompter.get_response(decoded_output)
+            print(decoded_output)
+            return  # early return for stream_output
+        # Without streaming
+        with torch.no_grad():
+            generation_output = model.generate(
+                input_ids=input_ids,
+                generation_config=generation_config,
+                return_dict_in_generate=True,
+                output_scores=True,
+                max_new_tokens=max_new_tokens,
+            )
+        s = generation_output.sequences[0]
+        output = tokenizer.decode(s)
+        print(output)
+        yield prompter.get_response(output)
+    gr.Interface(
+        fn=evaluate,
+        inputs=[
+            gr.components.Textbox(
+                lines=2,
+                label="Instruction",
+                placeholder="此处输入法律相关问题",  # "Enter a legal question here"
+            ),
+            # gr.components.Textbox(lines=2, label="Input", placeholder="none"),
+            gr.components.Slider(
+                minimum=0, maximum=1, value=0.1, label="Temperature"
+            ),
+            gr.components.Slider(
+                minimum=0, maximum=1, value=0.75, label="Top p"
+            ),
+            gr.components.Slider(
+                minimum=0, maximum=100, step=1, value=40, label="Top k"
+            ),
+            gr.components.Slider(
+                minimum=1, maximum=4, step=1, value=1, label="Beams"
+            ),
+            gr.components.Slider(
+                minimum=1, maximum=2000, step=1, value=256, label="Max tokens"
+            ),
+            gr.components.Checkbox(label="Stream output",  value=True),
+        ],
+        outputs=[
+            gr.components.Textbox(
+                lines=8,
+                label="Output",
+            )
+        ],
+        title="🦙🌲 LaWGPT",
+        description="",
+    ).queue().launch(server_name=server_name, share=share_gradio)
+if __name__ == "__main__":
+    fire.Fire(main)