corelang/core_main.cpp
2022-10-09 10:47:33 +02:00


/*
Current:
- [ ] Foreign import that would link library
- [ ] String in Language.core
- [ ] Way to import and force evaluate #import_lazy #import ?
- [ ] Unix port
- [ ] Imports are leaking names ! Multimedia leaks windows stuff
- [ ] Test and bulletproof any, slices
Memory:
- [ ] Redesign Type map to use List and reduce wasting space
- [ ] Redesign lexing to minimize memory usage, we got rid of heap but in a naive way!
In the future
- [ ] Cleanup
- [ ] Add ability to do i: int = 0 inside for loops for i: int = 0, i < 10, i+=1
- [ ] Complicated c declaration generation
- [ ] Expand macros
- [ ] Defer
- [ ] Basic
- [ ] Detecting if return was called
- [ ] Builtin data structures
- [ ] Slices
- [ ] Some way to take slice of data
- [ ] Tuples
- [ ] Dynamic arrays
- [ ] Hash tables
- [ ] Programming constructs
- [ ] Using language construct
- [ ] Named loops and breaking out of them
- [ ] Bytecode interpreter
- [ ] Ir
- [ ] Interpreter
- [ ] Code generation
- [ ] Parametric Polymorphism
Ideas
- [ ] #test construct that would gather all tests and run them on start of program or something
- [ ] Inject stack traces into the program
- [ ] Constant arrays that evaluate fully at compile time
- [ ] Rust like enum where you associate values(other structs) with key
- [ ] Cast from array to pointer?
- [ ] Ternary operator?
- [ ] Optionally pass size and alignment calculations to C ?
-------------------------------------------------------------------------------
2022.09.29 - Function overloads, operator overloads, namespaces idea
I thought about adding function overloading but I'm not sure if that's
actually good. It seems pretty hard to implement and there are many problems
with it. It's probably better to implement a decent namespacing mechanism.
Currently I'm thinking of something like this: operator overloads are this special
additional thing. They can be overloaded cause they are handled differently.
You are not looking up names, you are looking into an array of operators
during resolution of binary expressions etc. It shouldn't complicate things, hopefully.
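A minimal sketch of that lookup, assuming overloads sit in a flat array that is
scanned while resolving a binary expression. Every name here (Op_Kind, Type_Id,
Operator_Overload, Operator_Table) is hypothetical and not taken from the compiler.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the compiler's operator kinds and type handles.
enum class Op_Kind { Add, Sub, Mul };
using Type_Id = int;

// One registered overload: operator, operand types, and the function to call.
struct Operator_Overload {
    Op_Kind     op;
    Type_Id     left;
    Type_Id     right;
    std::string function_name;
};

// Operators live in their own array, separate from ordinary name lookup,
// so regular functions still map one name to one declaration.
struct Operator_Table {
    std::vector<Operator_Overload> overloads;

    // Linear scan by (operator, left type, right type). nullptr means there
    // is no overload and the resolver would fall back to builtin behaviour.
    const Operator_Overload *find(Op_Kind op, Type_Id left, Type_Id right) const {
        for (const Operator_Overload &it : overloads) {
            if (it.op == op && it.left == left && it.right == right) return &it;
        }
        return nullptr;
    }

    // Returns false when an overload with the same signature already exists.
    bool add(const Operator_Overload &candidate) {
        if (find(candidate.op, candidate.left, candidate.right)) return false;
        overloads.push_back(candidate);
        return true;
    }
};
```

Pointers returned by find are only valid until the next add, since the vector may reallocate.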
-------------------------------------------------------------------------------
2022.05.28 - On lambda expressions
I think it's not wise to add multiline lambda expressions.
As is the case with Python, it would require a new alternative syntax.
The idea was to imply that the whitespace significant syntax is just
inserting '{' '}' ';' automatically and if you decide to write a brace
it stops being whitespace significant and you can type everything with semicolons and braces
Problem is, first of all, it's kind of weird to have a completely different syntax
in a language to solve something that is a minor inconvenience,
second of all it turned out to be kind of bad, maybe if it would be more
complicated then it would be ok, but it implies that you have to end
a lot of things with semicolons unintuitively.
Probably single line lambdas should be ok. Currently braces turn off indentation
awareness. There might be something you can do by trying to turn that on, but the
problem is that it complicates parsing a lot, cause you have to account for multiple
indent styles, which means error messages become bad.
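A rough sketch of the "whitespace significant syntax just inserts '{' '}' ';'" idea,
not the actual lexer. It assumes spaces-only indentation that is consistent across
lines, and insert_braces is a hypothetical name. Open scopes are also closed when
the input ends.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rewrites indentation-based lines into brace/semicolon form.
// Deeper indentation opens a scope, shallower indentation closes scopes,
// and every statement gets a terminating ';'.
std::string insert_braces(const std::vector<std::string> &lines) {
    std::vector<std::size_t> indents = {0}; // stack of open indentation levels
    std::string out;
    bool first = true;

    for (const std::string &line : lines) {
        std::size_t indent = line.find_first_not_of(' ');
        if (indent == std::string::npos) continue; // skip blank lines

        if (!first) {
            if (indent > indents.back()) {
                // The previous line opened a scope.
                out += " { ";
                indents.push_back(indent);
            } else {
                // The previous statement ended; close any scopes we left.
                out += ";";
                while (indent < indents.back()) {
                    indents.pop_back();
                    out += " }";
                }
                out += " ";
            }
        }
        out += line.substr(indent);
        first = false;
    }

    // End of input: terminate the last statement and close remaining scopes.
    if (!first) {
        out += ";";
        while (indents.size() > 1) {
            indents.pop_back();
            out += " }";
        }
    }
    return out;
}
```

Once an explicit brace turns indentation awareness off, none of this applies, which is where the two styles start to fight each other.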
-------------------------------------------------------------------------------
2022.05.30 - On constructors and compound expressions
I unified the compound expression syntax (Type){} with function call syntax Type().
It differentiates based on whether you used a type. I think there is a benefit to the language
not having an idea of overloading the type constructor. You will always know what will happen.
Unlike C++ for example. It seems like a minefield that can fuck your mind up. So many corner cases
and variants. Having the idea of compounds doing one thing is reassuring I think.
You can always do a constructor by writing a function with lower case type if you want that.
For now I don't think it should be overloadable.
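A small sketch of that rule, assuming the resolver only needs to know what the
callee name refers to. Symbol_Kind, Call_Kind and classify_call are hypothetical
names, and the std::map stands in for real scope lookup.

```cpp
#include <map>
#include <string>

// Hypothetical symbol kinds; the real compiler resolves declarations instead.
enum class Symbol_Kind { Type, Function };
enum class Call_Kind { Compound, Function_Call, Unknown_Name };

// Name(...) is a compound expression when Name is a type and an ordinary
// call when it is a function. There is no overload set to search.
Call_Kind classify_call(const std::map<std::string, Symbol_Kind> &symbols,
                        const std::string &callee) {
    auto it = symbols.find(callee);
    if (it == symbols.end()) return Call_Kind::Unknown_Name;
    return it->second == Symbol_Kind::Type ? Call_Kind::Compound
                                           : Call_Kind::Function_Call;
}
```

With a type Vec2 and a lower case function vec2 declared side by side, Vec2(...) is always the compound and vec2(...) is always the user written constructor.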
-------------------------------------------------------------------------------
Imports
We compile lots of files, we keep track of them in Parse_Ctx, making sure we don't
parse the same thing twice. Files belong to a module, files can be loaded using #load "file".
All the files see all the decls from all the files in that module. We can import
other modules using a different directive #import. #import perhaps should be lazily
evaluated, making sure we don't resolve stuff we don't require. Currently we probably
want to export all the symbols, we can namespace them optionally.
2022.06.26 - import revision
Current design is a bit weird, there can be situations where you have collisions and stuff.
That's not cool. I was thinking of bringing back this idea where you have modules, implicit or
explicit(module directive at top of the file). Then each file that belongs to a module sees other
decls in that module and stuff like that. Other modules would be imported using a single keyword.
BUT on second thought I don't really like that design. It's not really clear how files that
belong to a module are found and concatenated. Then you would need to specify a directory to
compile instead of a single file and stuff like that which seems annoying! Current idea is
to just disallow having the same file loaded twice! It's the same result as with the module stuff,
we can't reuse files, but we avoid file collisions where we load a file and a module we import loads
a file. I like the load files approach to a module cause we specify which files to compile and load.
Imports also could be this other thing where we load lazily and we decrease the amount of code
that we output. Code in our project shouldn't be evaluated lazily cause that's counter intuitive.
For modules it's a bit different cause they should be distributed as valid.
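A minimal sketch of the "disallow loading the same file twice" rule, assuming paths
are already normalized to absolute form before they get here. Loaded_Files and
register_load are hypothetical names; the real bookkeeping lives in Parse_Ctx.

```cpp
#include <set>
#include <string>

// Hypothetical registry of every file pulled in through #load.
struct Loaded_Files {
    std::set<std::string> absolute_paths;

    // Returns true when the file is seen for the first time and should be parsed.
    // Returns false when it was already loaded, which the caller reports as an error.
    bool register_load(const std::string &absolute_path) {
        return absolute_paths.insert(absolute_path).second;
    }
};
```

Because the check is global and not per module, it also catches the case where our own #load and an imported module's #load name the same file.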
-------------------------------------------------------------------------------
## Done
- [x] Conditional compilation, maybe based on some pattern matching
- [x] #import "$OS.core"
- [x] You can't alias Lambdas because they are not evaluated as constant.
I used a little simplification where lambdas and structs were marked as such
in parsing I think. BUT Structs work so it's maybe just a little fix of constant
propagation using Operands, where I will need to modify Operand of lambda to
be constant AND rewrite_const to not rewrite a lambda OR SOMETHING, maybe it's fine,
I don't know. BUT not sure if we won't need to rewrite the idea that Lambdas can be Decls.
- [x] Operator Overloading
- [x] '.' Operator doesn't handle expressions inside the dot chain, no good, so casts don't work
- [x] Introduce List to reduce heap allocations and make it more arena friendly, can we get rid of heap completely?
- [x] Function renaming to prevent collisions, we can't really touch other stuff cause I want it to be easily debuggable
- [x] Fix Length etc. they should be function calls not operators
- [x] Idea to fix overshoot when debugging and it goes to the close bracket and there is not enough line directives. Store the last outputted line and propagate it on the close brace etc.
- [x] Disable .len for Strings, are there other things that use this convention?
- [x] Calculate size and alignment of struct data types
- [x] Consider changing syntax of scopes to use braces { } NO
- [x] Disable ability to parse inner structs, functions, constants etc. ?
- [x] Fix language_basics.kl string index error
- [x] Type as a parameter to a function, alloc :: (size: U64, type: Type)
- [x] Add token information to instructions
- [-] Mixing loads and imports leads to code duplication, is that what we want???
- [x] print those token lines nicely
- [x] Improve the python metaprogram
- [x] Implementing type Any
- [x] Runtime TypeInfo
- [x] Proper type Type support
- [x] Switch
- [x] Type aliases :: should probably be strictly typed, but assigning constant values should work
- [x] Array of inferred size
- [x] Casting pointers to and from void should be implicit
- [x] Multiple return values
- [x] Add c string
- [-] Should compound resolution use an algorithm to reorder compounds to initialize all fields in order
- [x] slices should be properly displayed in debugger
- [x] Imports inside of import shouldn't spill outside
- [x] Scope
- [x] #Assert that handles constants at compile time and vars at runtime
- [x] Hex 0x42
- [x] Rewrite where # happen,
- [x] elif
- [x] cast ->
- [x] Remodel compound from call to {}
- [x] Fix codegen renames
- [x] Field access rewrite
- [-] Constants embedded in structs should be able to refer to other constants in that namespace without prefix
- [-] Order independent constants in structs
- [-] Fix recursive lambdas in structs
- [x] Error message when file not found
- [x] Better error messages when type difference
- [-] Fixing access to functions/structs, in C we can't have functions inside of structs / functions so we need to rewrite the tree
- [x] Emitting #line
- [x] Making sure debugger works
- [x] We need ++ -- operators
- [x] Order independent declarations in structs ? NO MORE
- [x] Arrays with size passed
- [x] Values inited to 0 by default
- [x] Some way to call foreign functions
- [x] We are parsing wrong here: (t.str=(&string_to_lex.str)[i]);
- [x] Test new operators, add constant eval for them
- [x] lvalue, rvalue concept so we can't assign value to some arbitrary weird expression
- [x] Passing down program to compile through command line
- [x] More basic types
- [x] Implementing required operations int128
- [x] Fix casting
- [x] More for loop variations
- [x] Add basic support for floats
- [x] Converting from U64 token to S64 Atom introduces unannounced error (negates) - probably need big int
- [x] Add basic setup for new type system
- [x] Access through struct names to constants Arena.CONSTANT
- [x] Enums
- [x] Make sure pointer arithmetic works
- [x] Initial for loop
- [x] Enum . access to values
- [x] Character literal
- [x] Compiling and running a program
- [x] Infinite for loop
- [x] in new typesystem: Fix calls, fix all example programs
- [x] Fix arithmetic operations in new type system
- [x] Init statements, different kinds [+=] [-=] etc.
- [x] Struct calls
- [x] Operators: Bit negation, Not
- [x] Default values in calls
- [x] Resolving calls with default values
- [x] Pass statement
- [x] Lexer: Need to insert scope endings when hitting End of file
- [x] Resolving calls with named args, with indexed args
- [x] Structs
- [x] Struct field access
- [x] Struct field access with dots while compiling to arrows in c
- [x] Typespecs should probably be expressions so stuff like this would be possible :: *[32]int
- [x] Initial order independence algorithm
- [x] Think about compound expressions, unify with calls - maybe Thing(a=1) instead of Thing{a=1}
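
For the "Calculate size and alignment of struct data types" item above, a sketch of
the usual C-style layout rule. It assumes power-of-two alignments and fields laid
out in declaration order. Field_Layout, Struct_Layout, align_up and
compute_struct_layout are hypothetical names, not the compiler's own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Field_Layout {
    std::size_t size;
    std::size_t align;
};

struct Struct_Layout {
    std::size_t size;
    std::size_t align;
    std::vector<std::size_t> offsets; // one offset per field, in order
};

// Rounds value up to the next multiple of align; align must be a power of two.
std::size_t align_up(std::size_t value, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

// Each field starts at the next offset that satisfies its own alignment.
// The struct takes the largest field alignment, and its size is padded
// to that alignment so arrays of the struct stay aligned.
Struct_Layout compute_struct_layout(const std::vector<Field_Layout> &fields) {
    Struct_Layout result = {0, 1, {}};
    for (const Field_Layout &field : fields) {
        std::size_t offset = align_up(result.size, field.align);
        result.offsets.push_back(offset);
        result.size = offset + field.size;
        if (field.align > result.align) result.align = field.align;
    }
    result.size = align_up(result.size, result.align);
    return result;
}
```

An empty field list gives size 0 and alignment 1 here, which the generated C would have to special-case.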
*/
#include "base.cpp"
#include "base_unicode.cpp"
#include "os.h"
#if OS_WINDOWS
#include "os_windows.cpp"
#elif OS_UNIX
#include "os_unix.cpp"
#else
#error Could not figure out OS using macros
#endif
#include "c3_big_int.h"
#include "core_compiler.h"
#include "core_types.h"
#include "core_globals.cpp"
#include "core_generated.cpp"
#include "c3_big_int.cpp"
#include "core_lexing.cpp"
#include "core_ast.cpp"
#include "core_parsing.cpp"
#include "core_typechecking.h"
#include "core_types.cpp"
#include "core_typechecking.cpp"
#include "core_compiler.cpp"
#include "core_codegen_c_language.cpp"
int main(int argument_count, char **arguments){
#if OS_WINDOWS
    // Set output mode to handle virtual terminal sequences
    HANDLE hOut = GetStdHandle(STD_OUTPUT_HANDLE);
    if (hOut == INVALID_HANDLE_VALUE) {
        return GetLastError();
    }
    DWORD dwMode = 0;
    if (!GetConsoleMode(hOut, &dwMode)) {
        return GetLastError();
    }
    dwMode |= ENABLE_VIRTUAL_TERMINAL_PROCESSING;
    if (!SetConsoleMode(hOut, dwMode)) {
        return GetLastError();
    }
#endif

#if OS_WINDOWS
    test_os_memory();
#endif
    thread_ctx_init();
    test_unicode();
    map_test();
    test_string_builder();
    test_intern_table();

    // emit_line_directives = false;
    // emit_type_info = false;
    if(argument_count > 1){
        // Compile the file passed on the command line
        String program_name = string_from_cstring(arguments[1]);
        compile_file(program_name, COMPILE_PRINT_STATS);
    }
    else {
        // No arguments: compile and run every file in the examples directory
        Scratch scratch;
        Array<OS_File_Info> examples = os_list_dir(scratch, "examples"_s);
        For(examples){
            if(it.is_directory) continue;
            compile_file(it.absolute_path, COMPILE_AND_RUN | COMPILE_TESTING);
        }
    }
    return 0;
}