---
license: llama3.1
---

## Overview

This is a Hermes 3 Llama-3.1 8B model trained specifically to generate knowledge graph (KG) triples. It was first trained on a variety of open-source, general-purpose data for generating KGs, then further trained on synthetic data generated from 100k random samples of Prime-KG. This was done primarily as an experiment to see how well these models can be used for this task. The model was trained using the wonderful open-source package Unsloth.

The model in this repo is trained on the same data as M-Chimiste/Llama-3-8B-prime-graph-exp-1_merged, but retrained on Hermes 3 for the added capabilities of Hermes 3 and the 128k context length of Llama-3.1.

## Limitations

Extensive testing of this model has not been completed. It would be prudent to conduct your own evaluations before using it for anything beyond an experiment or proof of concept.

## Example Usage

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 8092
dtype = None  # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True  # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "theseus-research/llama-3.1-8b-prime-kg-exp-1",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

# Text Source (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC302039/)
text = f"""
Regulation of HIF-1α protein stability by the VHL tumor suppressor protein
We and others have recently demonstrated that HIF-1α is regulated by the ubiquitin-proteasome pathway under normoxic conditions, resulting in very rapid turnover of the protein, and that one of the early responses to hypoxia is massive upregulation of HIF-1α protein levels (Kallio et al., 1997; Salceda and Caro, 1997; Huang et al., 1998; Kallio et al., 1999).
Interestingly, VHL protein complexes have recently been demonstrated to harbor E3 ubiquitin-protein ligase activity, although the target protein for this activity has not yet been identified (Lisztwan et al., 1999; Iwai et al., 1999). Moreover, HIF-1α-regulated target genes such as VEGF are constitutively expressed at normoxia in VHL-deficient cells (Gnarra et al., 1996; Iliopoulos et al., 1996), indicating dysregulation of HIF-1α function in these cells. We were thus interested to investigate the potential mechanism of regulation of HIF-1α function by VHL. Due to the pronounced lability of the HIF-1α protein under normoxic conditions, HIF-1α is normally not detectable by immunoblot analysis of cellular extracts (Kallio et al., 1997, 1998). It was therefore not possible to investigate the effect of expression of VHL on endogenous HIF-1α protein levels. To establish experimental conditions to examine the effect of VHL on HIF-1α protein stability, we transiently transfected COS7 cells with FLAG epitope-tagged HIF-1α expression plasmids. As expected, at a low concentration (0.2 µg) of expression vector, HIF-1α was not detected at normoxia, and we observed potent stabilization of HIF-1α protein levels at hypoxia, as assessed by immunoblot analysis (Figure 1A). However, at higher concentrations (0.5–1 µg) of expression vector we could detect HIF-1α protein expression also under normoxic conditions (Figure 1A). At the highest concentrations of expression vector tested we observed significant HIF-1α expression levels both at normoxia and hypoxia (Figure 1A). Thus, these experiments suggest that the mechanism of degradation of HIF-1α had become saturated under these conditions and that one or several components of the degradation machinery were limiting. We next used the high level HIF-1α expression conditions for all subsequent experiments to examine the effect of VHL on HIF-1α protein levels. 
Transient coexpression of FLAG/HIF-1α and VHL resulted in reduction of the HIF-1α protein signal under normoxic conditions (Figure 1B, upper panel), indicating that VHL may have been limiting under the conditions of expression of HIF-1α alone. Interestingly, VHL failed to induce reduction of HIF-1α protein levels under hypoxic conditions (Figure 1B). In control experiments we detected similar levels of VHL expression in extracts from either normoxic or hypoxic cells (Figure 1B, lower panel). VHL-induced reduction of HIF-1α protein levels at normoxia was inhibited by treatment of the cells with the proteasome inhibitor MG-132 (Figure 1C). Taken together, these results strongly suggest that VHL mediates proteasomal degradation of HIF-1α. This effect of VHL was specific for HIF-1α, as transiently expressed VHL did not produce this effect on FLAG-tagged dioxin receptor (Figure 1D), a basic helix–loop–helix (bHLH)/PAS (Per/Arnt/Sim domain) protein belonging to the same class of transcription factors as HIF-1α.
Two domains of VHL are required for inducing protein degradation of HIF-1α
Given the potential role of VHL as an E3 ubiquitin ligase we examined whether VHL physically interacted with HIF-1α. 35S-labeled, in vitro translated HIF-1α was incubated with wild-type or mutant GAL4/VHL fusion proteins (schematically represented in Figure 2A) or the minimal GAL4 DNA binding domain alone prior to immunoprecipitation assays. In these experiments, 35S-labeled HIF-1α was co-immunoprecipitated in the presence of GAL4/VHL by anti-GAL4 specific antibodies, whereas no interaction was observed between HIF-1α and the minimal GAL4 DNA binding domain (Figure 2B, upper panel). Non-specific pre-immune rabbit antiserum did not precipitate HIF-1α protein in the presence of either VHL or GAL4 alone (Figure 2B, lower panel), indicating that wild-type VHL specifically interacted with HIF-1α in vitro.
The VHL Δ114–154 deletion mutant showed interaction with HIF-1α, whereas the VHL 114–154 fragment failed to do so (Figure 2B). A 23 amino acid-long N-terminal extension of this fragment generated VHL 91–154, which was able to interact with HIF-1α, indicating the importance of a structure located between residues 91 and 113 of VHL to interact with HIF-1α. Interestingly, this region of VHL is not only contained within the putative macromolecular binding site observed in the crystal structure of the VHL–BC complex (Stebbins et al., 1999), but also represents one of the mutational hotspots in tumors (Kaelin and Maher, 1998). This fact prompted us to examine whether tumor-derived mutations of VHL would affect its ability to interact with HIF-1α and/or to induce HIF-1α degradation. We performed these experiments using GAL4/VHL fusion proteins containing either a Y98N (the most frequent tumor mutation in this region; Kaelin and Maher, 1998) or a C162F single amino acid mutation. The C162F mutation has been demonstrated to render VHL unable to bind the elongin B–C complex (Lonergan et al., 1998; Lisztwan et al., 1999), and inhibit ubiquitin ligase activity in vitro (Lisztwan et al., 1999). In co-immunoprecipitation experiments, VHL Y98N was unable to interact with HIF-1α, whereas VHL C162F showed wild-type levels of interaction with HIF-1α (Figure 2C, upper panel). In our cellular degradation assay we transiently expressed at normoxia FLAG/HIF-1α in the presence or absence of wild-type or the individual point-mutated forms of VHL. Immunoblot analysis demonstrated that, in contrast to wild-type VHL, both the VHL Y98N and VHL C162F mutants failed to induce degradation of HIF-1α at normoxia (Figure 2D). These results demonstrate that both the HIF-1α interaction domain and the elongin C binding domain of VHL are necessary to mediate degradation of HIF-1α, and that regulation of HIF-1α may be involved in the tumor suppressor function of VHL. 
The oxygen-dependent degradation domain of HIF-1α is targeted for regulation by VHL
To identify the domain of HIF-1α that is targeted by VHL to mediate proteasomal degradation at normoxia, we transiently expressed in COS7 cells in the presence or absence of VHL either wild-type FLAG/HIF-1α or a series of FLAG-tagged HIF-1α deletion mutants. In analogy to wild-type HIF-1α, HIF-1α 1–652 lacking the C-terminus including the C-terminal transactivation domain (schematically represented in Figure 3A) was degraded in the presence of VHL. However, the protein levels of HIF-1α 1–330 lacking structures C-terminal of the PAS domain were not affected by VHL. HIF-1α 526–826 lacking N-terminal structures (including the bHLH and PAS domains) was also degraded upon exposure to VHL at normoxia (Figure 3A). In conclusion, these results indicate that a C-terminal region of HIF-1α spanning residues 526–652 mediated VHL-dependent degradation."""

content = f"""Create RDF triples for the content below enclosed in <>. Try to make global triples to summarize the content. Keep subject, predicate, and object as simple as possible.\n<{text}>"""

# You can guide the model by changing how granular you ask it to create triples.
messages = [
    {"role": "system", "content": """You are an expert knowledge graph annotator and you respond in JSON.
Here's the JSON schema you must adhere to where each element is a new triple if needed:\n\n[{"subject": str, "predicate": str, "object": str},...{"subject": str, "predicate": str, "object": str}]\n"""},
    {"role": "user", "content": content}
]

# Switch the Unsloth model to inference mode before generating.
FastLanguageModel.for_inference(model)

# Render the chat messages with the model's chat template and move them to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
    do_sample=True,
    use_cache=True,
    temperature=0.6,
    # top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

## Example Output

```json
[
  {
    "subject": "HIF-1α protein",
    "predicate": "regulated by",
    "object": "VHL tumor suppressor protein"
  },
  {
    "subject": "HIF-1α protein",
    "predicate": "regulated via",
    "object": "ubiquitin-proteasome pathway"
  },
  {
    "subject": "VHL protein complexes",
    "predicate": "possess",
    "object": "E3 ubiquitin-protein ligase activity"
  },
  {
    "subject": "VHL protein",
    "predicate": "interacts with",
    "object": "HIF-1α protein"
  },
  {
    "subject": "VHL protein",
    "predicate": "degradation domain",
    "object": "HIF-1α oxygen-dependent degradation domain"
  },
  {
    "subject": "VHL Y98N mutation",
    "predicate": "impairs",
    "object": "interaction with HIF-1α protein"
  },
  {
    "subject": "VHL C162F mutation",
    "predicate": "impairs",
    "object": "ubiquitin ligase activity"
  },
  {
    "subject": "VHL protein",
    "predicate": "required for",
    "object": "degradation of HIF-1α protein"
  },
  {
    "subject": "HIF-1α protein",
    "predicate": "regulated in",
    "object": "tumor cells"
  },
  {
    "subject": "VHL protein",
    "predicate": "role in",
    "object": "tumor suppressor function"
  }
]
```
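
Since the model has not been extensively tested, it may be worth validating its output before loading it into a graph. The following is a minimal sketch of one way to do that, assuming the output follows the subject/predicate/object schema from the system prompt. `parse_triples` and `sample` are hypothetical names used for illustration; they are not part of this model or of Unsloth.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]


def parse_triples(raw: str) -> List[Triple]:
    """Parse model output into (subject, predicate, object) tuples.

    Hypothetical helper: it assumes the output contains one JSON array
    of objects with "subject", "predicate", and "object" string fields.
    """
    # Keep only the outermost JSON array in case extra text surrounds it.
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no JSON array found in model output")
    items = json.loads(raw[start:end + 1])

    triples: List[Triple] = []
    for item in items:
        # Skip entries that do not match the expected schema.
        if not isinstance(item, dict):
            continue
        values = [item.get(key) for key in ("subject", "predicate", "object")]
        if all(isinstance(v, str) and v.strip() for v in values):
            triples.append(tuple(v.strip() for v in values))
    return triples


# Illustrative input: two well-formed triples and one incomplete entry.
sample = """[
  {"subject": "VHL protein", "predicate": "interacts with", "object": "HIF-1α protein"},
  {"subject": "VHL Y98N mutation", "predicate": "impairs", "object": "interaction with HIF-1α protein"},
  {"subject": "incomplete entry"}
]"""

for s, p, o in parse_triples(sample):
    print(f"{s} --[{p}]--> {o}")
```

In practice the decoded `response` string from the usage example would be passed in place of `sample`. Malformed entries are dropped silently here; stricter handling, such as raising on the first invalid entry, may suit some pipelines better.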